Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

regex - Extract occurrence of text between brackets from a text file Python

Log file:

INFO:werkzeug:127.0.0.1 - - [20/Sep/2018 19:40:00] "GET /socket.io/?polling HTTP/1.1" 200 -
INFO:engineio: Received packet MESSAGE, ["key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}]

I'm interested in extracting only the text within the brackets that contain the keyword "key", not every occurrence that matches the regex pattern below.

Here is what I have tried so far:

import re
with open('logfile.log', 'r') as text_file:
    matches = re.findall(r'\[([^]]+)\]', text_file.read())
    with open('output.txt', 'w') as out:
        out.write('\n'.join(matches))

This outputs every occurrence that matches the regex. The desired content of output.txt would look like this:

"key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}

1 Answer

Text inside square brackets that contains no [ or ] of its own can be matched with the negated character class [^][], which matches any single character other than ] and [.

That is, you can match a whole bracketed substring with \[[^][]*], and if it must contain some specific text, put that text after [^][]* and then add another [^][]* before the closing ].
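As a rough sketch of how those pieces fit together, the same pattern can be spelled out with re.VERBOSE so each part carries its own comment. The name KEY_IN_BRACKETS and the sample string are made up for illustration and are not part of the original question.

```python
import re

# Hypothetical name; the pattern is split across lines only for readability.
KEY_IN_BRACKETS = re.compile(r'''
    \[            # a literal opening square bracket
    (             # start of the captured group
      [^][]*      # zero or more characters that are not square brackets
      "key"       # the text that must appear inside the brackets
      [^][]*      # zero or more further non-bracket characters
    )             # end of the captured group
    ]             # a literal closing square bracket
''', re.VERBOSE)

# Illustrative sample: one bracket pair without the keyword, one with it.
sample = 'x [20/Sep/2018 19:40:00] y ["key",{"a":1}]'
print(KEY_IN_BRACKETS.findall(sample))
```

Only the second bracket pair is returned, because the first one never contains "key".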

You may use

re.findall(r'\[([^][]*"key"[^][]*)]', text_file.read())

See the Python demo:

import re
s = '''INFO:werkzeug:127.0.0.1 - - [20/Sep/2018 19:40:00] "GET /socket.io/?polling HTTP/1.1" 200 - 
INFO:engineio: Received packet MESSAGE, ["key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}]'''
print(re.findall(r'\[([^][]*"key"[^][]*)]', s))

Output:

['"key",{"data":{"tag1":12,"tag2":13,"tag3": 14"...}}']
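To tie this back to the file-based code in the question, here is a minimal sketch that wraps the pattern in two helpers. The function names bracketed_with and filter_log are assumptions made for this example, and the keyword is passed through re.escape so that any regex metacharacters in it are treated literally.

```python
import re

def bracketed_with(text, keyword):
    """Return the contents of every [...] group in text that contains keyword."""
    # re.escape keeps any regex metacharacters in the keyword literal.
    pattern = r'\[([^][]*' + re.escape(keyword) + r'[^][]*)]'
    return re.findall(pattern, text)

def filter_log(src_path, dst_path, keyword='"key"'):
    """Write the matching bracket contents to dst_path, one per line."""
    with open(src_path, 'r') as src:
        matches = bracketed_with(src.read(), keyword)
    with open(dst_path, 'w') as dst:
        dst.write('\n'.join(matches))
    return len(matches)
```

With the file names from the question, the call would be filter_log('logfile.log', 'output.txt'). Because [^][] excludes brackets, a payload that itself contains [ or ] (for example a JSON array) will not be matched by this approach.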
