Features Matching using Natural Language Processing


Muhammad Danial Khilji, Department of Data Science, Choreograph, United Kingdom


Feature matching is a basic step in matching different datasets. This article proposes a new hybrid model that uses BERT, a pretrained Natural Language Processing (NLP) model, in parallel with a statistical model based on Jaccard similarity to measure the similarity between lists of features from two different datasets. This reduces the time required to search for correlations or to manually match each feature from one dataset to another.


BERT, Cosine similarity, features, matching, semantic similarity, similarity
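
As a rough illustration of the statistical half of the hybrid model, the Jaccard similarity between two feature names can be computed over their sets of word tokens: the size of the intersection divided by the size of the union. The following is a minimal sketch and not the article's implementation. The tokenization rule, the function names tokenize, jaccard and match_features, and the sample feature names are all assumptions made for illustration, and the BERT component is omitted.

```python
import re


def tokenize(name):
    """Split a feature name into a set of lowercase word tokens.

    Assumed rule (not from the article): break camelCase boundaries,
    then split on any non-alphanumeric character.
    """
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return {t for t in re.split(r"[^A-Za-z0-9]+", spaced.lower()) if t}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two feature names."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    union = tokens_a | tokens_b
    if not union:
        # Both names are empty: define the similarity as 0.0.
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


def match_features(left, right):
    """Map each feature in `left` to its best-scoring feature in `right`.

    Assumes `right` is non-empty. Returns {feature: (best_match, score)}.
    """
    matches = {}
    for feature in left:
        best = max(right, key=lambda candidate: jaccard(feature, candidate))
        matches[feature] = (best, jaccard(feature, best))
    return matches


if __name__ == "__main__":
    # Hypothetical feature lists from two datasets.
    dataset_a = ["customerAge", "postal_code"]
    dataset_b = ["age_of_customer", "Postal Code", "income"]
    for feature, (best, score) in match_features(dataset_a, dataset_b).items():
        print(f"{feature} -> {best} ({score:.2f})")
```

A purely token-based score like this cannot relate synonyms such as "postcode" and "zip"; that gap is what a semantic model such as BERT, run in parallel, is meant to cover.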