Examples
- I am an historian and I need to go through 3000 documents to find specific people and list the sentences their names appear in. I can then read through those sentences manually to find which ones are relevant to my research.
- Input
- Find the data, i.e. where is it located? Can you access it?
- Determine the format of the data, e.g. text files, PDFs, or physical books.
- Processing
- Read the files
- Split the files into sentences
- Loop through each sentence and save those that contain the name to a list
- Output
- Save the list to a file which can be read or printed out
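The steps above can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes the documents are plain `.txt` files in a single folder, and it splits sentences naively at `.`, `!` or `?` followed by whitespace. The function names (`split_sentences`, `sentences_with_name`, `search_folder`) are made up for illustration.

```python
import re
from pathlib import Path


def split_sentences(text):
    """Naively split text into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentences_with_name(text, name):
    """Return the sentences in `text` that mention `name` (case-insensitive)."""
    return [s for s in split_sentences(text) if name.lower() in s.lower()]


def search_folder(folder, name, output_file):
    """Read every .txt file in `folder` and save matching sentences to `output_file`."""
    matches = []
    # Input: find and read the files (assumed to be UTF-8 text files)
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Processing: keep only the sentences that contain the name
        for sentence in sentences_with_name(text, name):
            matches.append(f"{path.name}: {sentence}")
    # Output: one match per line, prefixed with the file it came from
    Path(output_file).write_text("\n".join(matches), encoding="utf-8")
    return matches
```

With a hypothetical folder called `documents`, a call such as `search_folder("documents", "Lovelace", "matches.txt")` would write the matching sentences to `matches.txt`. Note that plain substring matching is crude: abbreviations such as "Dr." will split a sentence early, and a short name can match inside a longer word, so the results still need the manual read-through described above.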
- I’m a linguist and I need to extract all the verbs from a document, count them and display the 10 most frequent verbs. I will use these top 10 verbs to determine what the document is about.
- Input
- Find the document, i.e. where is it located?
- Determine what file type the document is.
- Processing
- Read the contents of the document
- Separate all the words
- Filter out anything that is not a verb
- Count the remaining verbs
- Sort them from most frequent to least frequent
- Output
- Display the top 10 elements of the sorted list
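The processing and output steps above can be sketched in Python. This is a minimal sketch under a loud assumption: deciding whether a word is a verb normally needs a part-of-speech tagger from a language-processing library, so here a tiny hand-made set, `KNOWN_VERBS`, stands in for that step. Both `KNOWN_VERBS` and `top_verbs` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Assumption: this small hand-made set stands in for a real part-of-speech tagger.
KNOWN_VERBS = {"run", "runs", "ran", "read", "reads", "write", "writes", "wrote"}


def top_verbs(text, verbs=KNOWN_VERBS, n=10):
    """Return the `n` most frequent verbs in `text` as (verb, count) pairs."""
    # Separate all the words (lower-cased so "Reads" and "reads" count together)
    words = re.findall(r"[a-z']+", text.lower())
    # Filter out anything that is not a verb, then count what remains
    verb_counts = Counter(word for word in words if word in verbs)
    # most_common sorts from most frequent to least frequent
    return verb_counts.most_common(n)


if __name__ == "__main__":
    sample = "She reads and writes. He reads. They ran."
    for verb, count in top_verbs(sample):
        print(verb, count)
```

`Counter.most_common` handles both the counting order and the "top 10" cut, so the sort step and the output step collapse into one call. This sketch also counts "reads" and "read" as separate verbs; grouping different forms of the same verb would need an extra step that is not shown here.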