k-NN Embedding Stability for word2vec Hyper-Parametrisation in Scientific Text

Dridi, Amna and Gaber, Mohamed Medhat and Azad, R. Muhammad Atif and Bhogal, Jagdev (2018) k-NN Embedding Stability for word2vec Hyper-Parametrisation in Scientific Text. In: Discovery Science, October 29–31, 2018, Limassol, Cyprus.


Word embeddings are increasingly attracting the attention of researchers dealing with semantic similarity and analogy tasks. However, finding the optimal hyper-parameters remains an important challenge because of their impact on the revealed analogies, mainly for domain-specific corpora. Because analogies are widely used for hypothesis synthesis, it is crucial to optimise word embedding hyper-parameters for precise hypothesis synthesis. Therefore, we propose in this paper a methodological approach for tuning word embedding hyper-parameters by using the stability of the k-nearest neighbors of word vectors within scientific corpora, and more specifically Computer Science corpora, with Machine Learning adopted as a case study. This approach is tested on a dataset created from NIPS (Conference on Neural Information Processing Systems) publications, and evaluated with a curated ACM hierarchy and the Wikipedia Machine Learning outline as the gold standard. Our quantitative and qualitative analyses indicate that our approach not only reliably captures interesting patterns like "unsupervised_learning is to kmeans as supervised_learning is to knn", but also captures the analogical hierarchy structure of Machine Learning and consistently outperforms state-of-the-art embeddings on syntactic accuracy, reaching 68% against 61%.
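The abstract does not spell out how k-nearest-neighbor stability is measured, so the following is only a minimal sketch of one plausible reading. It assumes two embedding matrices trained with different hyper-parameter settings over the same vocabulary (row i is the same word in both), cosine similarity as the neighbor criterion, and mean Jaccard overlap as the stability score. The function names `knn_sets` and `knn_stability` and the random toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def knn_sets(emb, k):
    """For each row of emb, return the set of indices of its k nearest rows (cosine similarity)."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a word is never its own neighbor
    top = np.argsort(-sim, axis=1)[:, :k]
    return [set(row.tolist()) for row in top]

def knn_stability(emb_a, emb_b, k):
    """Assumed stability score: mean Jaccard overlap of per-word k-NN sets in two embeddings."""
    sets_a, sets_b = knn_sets(emb_a, k), knn_sets(emb_b, k)
    overlaps = [len(a & b) / len(a | b) for a, b in zip(sets_a, sets_b)]
    return float(np.mean(overlaps))

# Toy data standing in for embeddings trained under different settings.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 16))
noisy = base + 0.01 * rng.normal(size=base.shape)   # nearly the same geometry
unrelated = rng.normal(size=(50, 16))                # unrelated geometry

print(knn_stability(base, base, k=5))
print(knn_stability(base, noisy, k=5))
print(knn_stability(base, unrelated, k=5))
```

Under this reading, a hyper-parameter setting would be preferred when small changes to it leave the neighbor sets largely unchanged, that is, when the score stays close to 1.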

Item Type: Conference or Workshop Item (Speech)
Dates: 24 July 2018 (Accepted)
Subjects: CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
Divisions: Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology
Depositing User: Mohamed Gaber
Date Deposited: 29 Oct 2018 14:27
Last Modified: 22 Mar 2023 12:01
URI: https://www.open-access.bcu.ac.uk/id/eprint/6501
