Extracting data from multiple txt files

Hello, is there any way to extract certain lines from multiple txt files that contain dialogues? I need to extract lines based on a person’s name, e.g. Rich’s lines:

Rich: I’m going to attend a concert on Saturday.
Do you have any special plans?
Peter: No, I’m going to relax. What did you do last weekend?
Rich: Last weekend, I went to visit my friends in San Francisco. What did you do?
Peter: I played soccer with some friends.

You can do so by adding an if statement that checks whether each line starts with the name you specify.
If you don’t get my point, tell me which language you’re using and I’ll help you with the code.
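
In Python, that check could look roughly like the sketch below. The function name lines_for and the sample list are only illustrative, not part of any existing script. It keeps only lines that begin with the name, so a continuation line such as "Do you have any special plans?" is not picked up.

```python
def lines_for(name, lines):
    """Return only the lines that begin with '<name>:'."""
    prefix = name + ":"
    return [line for line in lines if line.startswith(prefix)]


# Sample dialogue taken from the question (illustrative data only)
dialogue = [
    "Rich: I'm going to attend a concert on Saturday.",
    "Do you have any special plans?",
    "Peter: No, I'm going to relax. What did you do last weekend?",
    "Rich: Last weekend, I went to visit my friends in San Francisco. What did you do?",
    "Peter: I played soccer with some friends.",
]

# Print every line that starts with "Rich:"
for line in lines_for("Rich", dialogue):
    print(line)
```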

I am a beginner in Python and do not know how to start. Also, because each person has many lines and there are many persons, how could I specify where to stop reading and when to start again? I am thinking of creating another file with the persons’ names, something like an index, and extracting based on this index. But there is still the problem of where to start and stop reading.

Thank you

It’s okay, you can let the program go through the lines with a for loop, and once a line starts with a different name, it will exclude that line.
I can do it for you, just tell me if you want.

Yes, please, if you could, because I don’t think I can manage it.

Thank you

Hi friend,
Take a look at this code.
In theWantedName, put the name you want to extract,
and in excludedNames, add the rest of the names.
Remove Mohamed and Marie, I just added them for testing,
and follow the instructions in the comments in the code.

import re

# Change the string below to the name whose lines you want to extract.
theWantedName = 'Mohamed'

# Add the rest of the names here. Do not include the wanted name, and
# always put the pipe | before each additional name.
excludedNames = re.compile(r'^(Peter|Marie|Rich)(:| :)')

exportedLines = []
capturing = False  # True while we are inside the wanted person's lines

# In the open() below, put the path to the file that contains the whole
# scenario. Mine is "./wholeScenario.txt".
with open('./wholeScenario.txt') as wholeScenario:
    for line in wholeScenario:
        if line.startswith(theWantedName):
            capturing = True
        elif excludedNames.search(line):
            capturing = False
        # Lines without a name keep the current state, so continuation
        # lines of the wanted person are exported as well.
        if capturing:
            exportedLines.append(line)

with open('./%sScenario.txt' % theWantedName, 'w') as publish:
    publish.write(''.join(exportedLines))

I hope that helps

Thank you very much, I will try it

You’re welcome.
Tell me if it worked.

Thank you again for the answer. I was trying to make it work without success. It is actually a bit more complicated than I expected, because there are more than 200 persons in the files, each with a first and last name and more than one line. I need to extract what each person says into a separate file and give each file the corresponding name.

For example:
One of the multiple input txt files:
1.Rich Smith : I’m going to attend a concert on Saturday.
Do you have any special plans?
2.Peter Aderson : No, I’m going to relax. What did you do last weekend?
3.Rich Smith: Last weekend, I went to visit my friends in San Francisco. What did you do?
4.Peter Aderson: I played soccer with some friends.
5.Mary Sarah: Hello

6.John Daisa: hi

The output I need is:

Rich.txt
Rich Smith : I’m going to attend a concert on Saturday
Rich Smith: Last weekend, I went to visit my friends in San Francisco. What did you do?

Peter.txt
Peter Aderson: No, I’m going to relax. What did you do last weekend?
Peter Aderson: I played soccer with some friends.

Mary.txt
Mary Sarah: Hello

John.txt
John Daisa: hi

I am thinking of having another file with the list of names and somehow checking it against the dialogue files, but it is still complicated.
I am a beginner in Python, and this looks impossible to me.
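
One possible way to do this without a separate list of names is sketched below. It rests on assumptions that are not confirmed anywhere in this thread: every turn starts with an optional number such as "3.", then a name made of capitalized words, then a colon, and any line without such a prefix continues the previous speaker's turn. The names SPEAKER, group_by_speaker and export are hypothetical helpers. Output files are named after the first name, as in the example above, so two people sharing a first name would end up in the same file.

```python
import glob
import os
import re
from collections import defaultdict

# Assumed line shape: optional "3." numbering, a name made of capitalized
# words, an optional space, a colon, then the spoken text.
SPEAKER = re.compile(r"^\s*(?:\d+\.)?\s*([A-Z][\w'-]*(?: [A-Z][\w'-]*)*)\s*:\s*(.*)$")


def group_by_speaker(lines):
    """Map each speaker's full name to the list of lines they say."""
    turns = defaultdict(list)
    current = None  # speaker of the turn we are currently inside
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines carry no dialogue
        match = SPEAKER.match(line)
        if match:
            current = match.group(1)
            turns[current].append(f"{current}: {match.group(2)}")
        elif current is not None:
            # No "Name:" prefix, so treat it as a continuation line
            turns[current].append(line)
    return dict(turns)


def export(input_pattern, output_dir):
    """Read every file matching input_pattern, write one file per speaker."""
    merged = defaultdict(list)
    for path in sorted(glob.glob(input_pattern)):
        with open(path, encoding="utf-8") as handle:
            # Each file is parsed on its own so a turn never spans two files
            for speaker, said in group_by_speaker(handle).items():
                merged[speaker].extend(said)
    os.makedirs(output_dir, exist_ok=True)
    for speaker, said in merged.items():
        first_name = speaker.split()[0]  # "Rich Smith" -> "Rich.txt"
        out_path = os.path.join(output_dir, first_name + ".txt")
        with open(out_path, "w", encoding="utf-8") as out:
            out.write("\n".join(said) + "\n")
    return sorted(merged)
```

Calling export("dialogues/*.txt", "per_person") would read every txt file in a folder named dialogues and write Rich.txt, Peter.txt and so on into per_person; both folder names are placeholders. Keep the output folder separate from the input folder so that a second run does not read its own results. Unlike the example output above, this sketch also keeps continuation lines such as "Do you have any special plans?".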

It’s easy, don’t worry.
I’m just tired of coding right now.
I’ll do it for you later.
If you use Facebook, add me. It will be easier to chat there.

OK, I didn’t have Facebook, so I just created an account and I am going to add you.

Thank you