The classifiers in machine learning packages like liblinear and nltk offer a method show_most_informative_features(), which is really helpful for debugging features:

    ****** = None      ok : spam   =   4.5 : 1.0
     hello = True      ok : spam   =   4.5 : 1.0
     hello = None    spam : ok     =   3.3 : 1.0
    ****** = True    spam : ok     =   3.3 : 1.0
    casino = True    spam : ok     =   2.0 : 1.0
    casino = None      ok : spam   =   1.5 : 1.0

Is something similar implemented for the classifiers in scikit-learn? I searched the documentation but couldn't find anything like it. If there is no such function yet, does anybody know a workaround to get at those values?

Thanks a lot!

1 Answer

Best answer
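This works if you extract your features with a vectorizer (CountVectorizer, TfidfVectorizer, or DictVectorizer) and use a linear model, i.e. a classifier that exposes coef_. For binary classification you can use this code:

    def show_most_informative_features(vectorizer, clf, n=20):
        # get_feature_names() was removed in scikit-learn 1.2;
        # get_feature_names_out() is its replacement (available since 1.0).
        feature_names = vectorizer.get_feature_names_out()
        # Sort (coefficient, feature name) pairs from most negative to most positive.
        coefs_with_fns = sorted(zip(clf.coef_[0], feature_names))
        # Pair the n most negative coefficients with the n most positive ones.
        top = zip(coefs_with_fns[:n], coefs_with_fns[:-(n + 1):-1])
        for (coef_1, fn_1), (coef_2, fn_2) in top:
            print("\t%.4f\t%-15s\t\t%.4f\t%-15s" % (coef_1, fn_1, coef_2, fn_2))

The left column shows the features with the most negative coefficients, which push a prediction toward clf.classes_[0]; the right column shows the most positive ones, which push toward clf.classes_[1]. Note that these are raw model coefficients, not the likelihood ratios that nltk prints.

Hope this answer helps.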
