Word frequency count from a file with Python


Hello, I am new to this forum and need some help. I have written Python code that counts word frequencies from a file into a dictionary. However, for some reason it seems to count every word more often than it appears in the file.
For example, it reports “creating”: 4, but its frequency in the file is 3. Below is my code. I would appreciate it if someone could help me find the error.

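def word_frequencies(filename="src/alice.txt"):
    """Return a dict mapping each lower-cased word in the file to its count."""
    d = {}
    with open(filename, 'r') as f:
        for line in f:
            # Normalize case, then split the line on whitespace.
            line = line.lower()
            line = line.split()
            # Strip punctuation from both ends of each word.
            stripped = [x.strip('''!"#$%&'()*,-./:;?@[]_''') for x in line]

            for word in stripped:
                try:
                    d[word] += 1
                except KeyError:
                    d[word] = 1

    return d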

It works for me. Could you post the contents of the file?

The Project Gutenberg EBook of Alice in Wonderland, by Lewis Carroll

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org


Title: Alice in Wonderland

Author: Lewis Carroll

Illustrator: Gordon Robinson

Release Date: August 12, 2006 [EBook #19033]

Language: English

Character set encoding: ASCII

*** START OF THIS PROJECT GUTENBERG EBOOK ALICE IN WONDERLAND ***




Produced by Jason Isbell, Irma Spehar, and the Online
Distributed Proofreading Team at http://www.pgdp.net









          [Illustration: Alice in the Room of the Duchess.]


                       _THE "STORYLAND" SERIES_



                   ALICE'S ADVENTURES IN WONDERLAND







                     SAM'L GABRIEL SONS & COMPANY

                               NEW YORK



                           Copyright, 1916,

                   by SAM'L GABRIEL SONS & COMPANY

                               NEW YORK




ALICE'S ADVENTURES IN WONDERLAND

[Illustration]




I--DOWN THE RABBIT-HOLE

This is a part of the file’s content.

“creating” does indeed appear 4 times in the txt file. Notice that the first occurrence has an upper-case initial: “Creating”. Since your code lower-cases every line before counting, that occurrence is counted too. Maybe you are checking by searching manually, but with a case-sensitive search?
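If it helps, here is a rough sketch of how the two kinds of search can disagree. The helper `count_word` and the sample sentence are made up for illustration (they are not from your file); it only uses `re.findall` with and without `re.IGNORECASE`.

```python
import re

def count_word(text, word, ignore_case=True):
    """Count whole-word occurrences of `word` in `text` (hypothetical helper)."""
    flags = re.IGNORECASE if ignore_case else 0
    # \b anchors keep "creating" from matching inside a longer word.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags))

# Invented sample text: one capitalized and three lower-case occurrences.
sample = "Creating the works. Stop creating, creating, and creating."

print(count_word(sample, "creating", ignore_case=False))  # 3, like a case-sensitive search
print(count_word(sample, "creating", ignore_case=True))   # 4, like the lower-cased dictionary
```

Running the case-insensitive variant on the full text of your file should give the same number as `d["creating"]`, assuming every occurrence is separated from its neighbours by whitespace or punctuation.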

Got it. Thanks for pointing out this minor confusion. It works great now!