Using JSON files in Python for a multilabel classification problem

asked Jan 27, 2021 in Technique by 深蓝 (71.8m points)

I'm quite new to Python and I'm trying to solve a problem. I have a JSON file and I need to do multilabel classification on it. I decided to use TF-IDF. My JSON file looks like this: training_data

My techniques file looks like this: Techniques

Can someone give me some tips in order to preprocess the data?

The data for testing looks like this: Testing Data
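One possible preprocessing step for the labels, sketched under assumptions: each record is taken to have a 'text' string and a 'labels' list (the two keys the code in this question reads), and the technique names and records below are made-up placeholders, not the real files. scikit-learn's MultiLabelBinarizer turns the per-example label lists into a 0/1 indicator matrix, which is the target format multilabel estimators expect.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Placeholder technique names and records (hypothetical, not the real data)
techniques = ["technique_a", "technique_b", "technique_c"]
json_obj = [
    {"text": "first example text", "labels": ["technique_c", "technique_a"]},
    {"text": "second example text", "labels": []},
    {"text": "third example text", "labels": ["technique_b"]},
]

text_list = [example["text"] for example in json_obj]
tech_list = [example["labels"] for example in json_obj]

# Passing `classes` ties the column order to the techniques list, and keeps
# a column even for techniques that never occur in the training data
mlb = MultiLabelBinarizer(classes=techniques)
y_train = mlb.fit_transform(tech_list)

print(y_train)
```

Each row of y_train corresponds to one example and each column to one technique, so an example with no labels becomes a row of zeros.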

import json
import sys

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Read the technique names, one per line, skipping near-empty lines
with open(propaganda_techniques_file, 'r') as f:
    techniques = [line.rstrip() for line in f if len(line) > 2]

# Read data from training_set_task1
try:
    with open(training_file, 'r', encoding='utf-8') as f:
        json_obj = json.load(f)
except (OSError, ValueError):
    sys.exit("ERROR: cannot load json file")

# Read the test data
try:
    with open(test_file, 'r', encoding='utf-8') as f:
        json_test = json.load(f)
except (OSError, ValueError):
    sys.exit("Error")

# Collect the labels and the text of every training example
tech_list = [example['labels'] for example in json_obj]
text_list = [example['text'] for example in json_obj]

# Collect the text of every test example
test_text = [ex['text'] for ex in json_test]

vec_train = TfidfVectorizer()
X_train = vec_train.fit_transform(text_list)
y_train = tech_list

# A second vectorizer is fitted on the test text here
vec_test = TfidfVectorizer()
X_test = vec_test.fit_transform(test_text)

clf = LogisticRegression(penalty='l2', multi_class='multinomial', solver='newton-cg')

# predict() is called on a classifier that has not been fitted
y_pred = clf.predict(X_test)

This is my code for now, but I get an error message: This LogisticRegression instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
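The error says that predict() was called before fit(). A minimal sketch of one way around it follows, with some assumptions stated up front: the texts and label names are made up, and OneVsRestClassifier wrapped around LogisticRegression is a swapped-in choice for the multilabel case rather than the setup from the code above. The sketch also reuses a single fitted TfidfVectorizer for the test text, because a second vectorizer fitted on the test set would produce a different vocabulary and feature count.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Made-up stand-ins for text_list, tech_list and test_text
text_list = [
    "they are liars and crooks",
    "vote now or lose everything",
    "crooks want you to lose everything",
    "a calm report on the weather",
]
tech_list = [["smear"], ["fear"], ["smear", "fear"], []]
test_text = ["those crooks again", "weather report"]

# Fit the vectorizer on the training text only, then reuse it for the test text
vec = TfidfVectorizer()
X_train = vec.fit_transform(text_list)
X_test = vec.transform(test_text)

# Turn the label lists into a 0/1 indicator matrix
mlb = MultiLabelBinarizer()
y_train = mlb.fit_transform(tech_list)

# One binary logistic regression per label; fit() must come before predict()
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
pred_labels = mlb.inverse_transform(y_pred)  # back to tuples of label names
print(pred_labels)
```

With a toy data set this small the predictions themselves are not meaningful; the point is the order of the calls and the shared vectorizer.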

I tried to split my testing data into two parts, labels and text, so that the text is the input and the label is the output (I don't know if this is a good approach).
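Treating the text as input and the labels as output is the usual supervised setup. To check whether it works before predicting on unlabeled data, one option is to hold out part of the labeled examples for validation. The sketch below assumes made-up texts and labels, a 25% hold-out, and micro-averaged F1 as the metric; none of these come from the original task description.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Made-up labeled examples (hypothetical, not the real training set)
texts = [
    "they are liars and crooks", "vote now or lose everything",
    "crooks want you to lose everything", "a calm report on the weather",
    "liars everywhere, fear them", "the weather stays calm today",
    "crooks and liars again", "lose everything unless you act",
]
labels = [["smear"], ["fear"], ["smear", "fear"], [],
          ["smear", "fear"], [], ["smear"], ["fear"]]

# Encode the label lists as a 0/1 indicator matrix
y = MultiLabelBinarizer().fit_transform(labels)

# Keep 25% of the labeled examples aside for validation
X_tr, X_val, y_tr, y_val = train_test_split(
    texts, y, test_size=0.25, random_state=0
)

# The pipeline fits TF-IDF on the training split only and reuses it at predict time
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(X_tr, y_tr)

score = f1_score(y_val, model.predict(X_val), average="micro", zero_division=0)
print(f"micro-F1 on the held-out split: {score:.2f}")
```

Putting the vectorizer and classifier in one pipeline keeps the validation text out of the TF-IDF vocabulary, so the score is not inflated by information from the held-out examples.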
