Word Embedding

Published: 2022-10-29


Embedding:
An embedding is a relatively low-dimensional space into which you can translate high-dimensional vectors. Embeddings make it easier to do machine learning on large inputs like sparse vectors representing words.


Why use an embedding layer?
Vectors produced by one-hot encoding are very high-dimensional and sparse. Suppose we are working with a dictionary of 2,000 words in natural language processing (NLP). With one-hot encoding, each word is represented by a vector of 2,000 integers, 1,999 of which are zeros. With an even larger dictionary, this method becomes still less computationally efficient.

Word embeddings can be thought of as an alternative to one-hot encoding combined with dimensionality reduction.
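As a rough sketch in NumPy (the word index and the embedding size here are arbitrary assumptions, and the embedding matrix is random rather than learned), the difference in dimensionality looks like this:

```python
import numpy as np

vocab_size = 2000   # the 2000-word dictionary from the example above
embed_dim = 8       # embedding size chosen only for illustration

word_index = 42     # hypothetical position of a word in the dictionary

# One-hot: a 2000-dimensional vector with a single 1 and 1999 zeros
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1.0

# Embedding: a lookup into a (vocab_size x embed_dim) weight matrix;
# in a real model these weights are learned, here they are just random
embedding_matrix = np.random.randn(vocab_size, embed_dim)
dense_vector = embedding_matrix[word_index]

print(one_hot.shape)       # (2000,) -> sparse, almost all zeros
print(dense_vector.shape)  # (8,)    -> dense and low-dimensional
```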

Word Embeddings
Word embedding is an approach for representing words and documents. A word embedding, or word vector, is a numeric vector that represents a word in a lower-dimensional space. It allows words with similar meanings to have similar representations, and these vectors can also approximate meaning.

Goal of Word Embeddings

  • To reduce dimensionality
  • To use a word to predict the words around it (a skip-gram sketch follows this list)
  • To capture inter-word semantics
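
One way to realize the second goal is the skip-gram objective of word2vec, where each word is trained to predict its neighbours. A minimal sketch with gensim (the toy corpus and the hyperparameters are assumptions chosen only for illustration):

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (any similar corpus would do)
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "lay", "on", "the", "rug"],
]

# sg=1 selects skip-gram: each word is used to predict the words around it
model = Word2Vec(sentences, vector_size=8, window=2, min_count=1, sg=1)

print(model.wv["cat"])                    # the learned 8-dimensional vector for "cat"
print(model.wv.similarity("cat", "dog"))  # words in similar contexts get similar vectors
```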

How are Word Embeddings used?

  • They are used as input to machine learning models (a minimal model sketch follows this list).
  • Take the words → give their numeric representation → use it in training or inference.
  • They can represent or visualize underlying patterns of usage in the corpus that was used to train them.
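
As a sketch of the "numeric representation as model input" flow, here is a tiny Keras model built around an Embedding layer (the vocabulary size, embedding size, and word indices are assumptions, not values from the text):

```python
import tensorflow as tf

vocab_size = 2000
embed_dim = 8

# The Embedding layer maps each integer word index to a trainable dense vector,
# which the rest of the model consumes during training or inference.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Take the words -> give their numeric representation -> use it in inference
batch = tf.constant([[4, 25, 7, 0], [13, 2, 0, 0]])  # hypothetical padded word indices
print(model(batch).shape)  # (2, 1): one prediction per input sequence
```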

