User Profile
Computation of cosine similarity over a list of values using python
Below is my code for computing cosine similarity over a list of values. My goal is to compare each value in the list f (f = [[3492.6], [13756.2], [22442.1], [22361.9], [26896.4]]) with the remaining values and output their similarity scores. However, for some reason I keep getting 1.0 as the cosine similarity, even when I test the code on other data sets. Obviously, [22361.9] is more similar to [22442.1] than...
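A likely explanation for the constant 1.0: every entry of f is a one-element vector, and cosine similarity only measures the angle between two vectors. Any two one-dimensional vectors with the same sign point in the same direction, so the score is 1.0 whatever their magnitudes. The sketch below illustrates this. The helper names `cosine_similarity` and `relative_closeness` are hypothetical, and `relative_closeness` is only one assumed alternative for comparing single numbers, not something taken from the thread.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def relative_closeness(a, b):
    """Assumed alternative for scalars: 1.0 when equal, smaller as they diverge."""
    return 1.0 - abs(a - b) / max(abs(a), abs(b))

f = [[3492.6], [13756.2], [22442.1], [22361.9], [26896.4]]

# Compare every pair once and print both scores side by side.
for i in range(len(f)):
    for j in range(i + 1, len(f)):
        cos = cosine_similarity(f[i], f[j])
        close = relative_closeness(f[i][0], f[j][0])
        print(f[i], f[j], round(cos, 6), round(close, 4))
```

With one-element vectors the cosine column stays at 1.0 for every pair, while the magnitude-based score separates near values from distant ones.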
Hi bvdet, thanks for the reply. Your solution works fine. I have one more question for you: how do I create a loop that will let me compute 10%, 20%, ..., 90% of the text in the document in one run, instead of computing a single percentage each time?
Thanks, pal.
Hi dwblas, sorry I couldn't report back on the above solution sooner. I tried using float as you suggested, but I was still getting the same error message. Can you try the code on a text document to see if it works? If it does, please let me know which version of Python you are using. Thanks in advance.
Stop-word removal and document tokenization using NLTK library
I’m having difficulty removing stop words from and tokenizing a text file using NLTK. I keep getting the following error message: AttributeError: 'list' object has no attribute 'lower'. I just can’t figure out what I’m doing wrong, although this is my first time doing something like this. Below are my lines of code. I’ll appreciate any suggestions, thanks.
...
Code:
import nltk
from nltk.corpus import stopwords
s = open("C:\zircon\sinbo1.txt").read()
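The rest of the code is not shown, so the cause can only be guessed at. This AttributeError typically appears when `.lower()` is called on a list of words (for example the result of `split()`) instead of on a string. A minimal sketch of the usual fix is below: lowercase the text first, then tokenize, then filter. It uses plain Python so that it runs without any NLTK data. `STOP_WORDS` is a tiny hypothetical stand-in for `stopwords.words("english")`, and the regex tokenizer and function names are assumptions, not the poster's code.

```python
import re

# Tiny hypothetical stand-in for NLTK's stopwords.words("english").
STOP_WORDS = {"the", "is", "a", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase the whole string first, then split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop-word set."""
    return [t for t in tokens if t not in STOP_WORDS]

sample = "The cat is in the garden and the dog is asleep"
tokens = tokenize(sample)            # a list of lowercase strings
filtered = remove_stop_words(tokens)
print(filtered)

# Calling .lower() on the list itself reproduces the reported error.
try:
    tokens.lower()
except AttributeError as err:
    print("AttributeError:", err)
```

If a list of tokens already exists, lowercasing each item with `[t.lower() for t in tokens]` avoids the error in the same way.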
Hi bvdet, thanks for your response. You've really been a help to me. Your code worked, but only when x is 100 or above. It does not slice through the text (it can't produce 90%, 80%, ..., 10% of the text). Besides, I would like to loop through the percentages in order to get the above proportions of the text. I keep getting the following error: TypeError: unsupported operand type(s) for /: 'list' and 'int'. Below are my lines of code.
...
Code:
x = range(10, 100, 10)
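The TypeError suggests that the whole sequence `x` is being divided by an integer, which Python does not support; each percentage has to be handled individually inside a loop. A minimal sketch, assuming the text has already been split into a list of words (the names `words`, `percentages`, and `cutoffs` are illustrative only):

```python
percentages = range(10, 100, 10)   # 10, 20, ..., 90
words = "one two three four five six seven eight nine ten".split()

# percentages / 100 would raise TypeError: a sequence cannot be divided
# by an int. Work on one percentage at a time instead.
cutoffs = [len(words) * p // 100 for p in percentages]

for p, cutoff in zip(percentages, cutoffs):
    # Slice the first p percent of the word list.
    print(p, words[:cutoff])
```

Integer division (`//`) keeps each cut-off a whole number, so it can be used directly as a slice index.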
Hi bvdet, thanks for the response. The code worked, but only the file names were displayed, not the contents of the files. Is there a way to display the contents of the files in the directory? Another module needs to read the contents of the files and compare them.
How to open and read through multiple files in a directory with python
I can open and read the content of a single text file using the following lines of code.
Code:
f = open('C:\\xyz.text', 'r')
f.read()
How can I read through multiple files in a directory? I would like to compare them based on the similarity of their textual content. Any suggestions? Thanks.
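One common approach is to list the directory, open each file in turn, and collect the contents in a dictionary keyed by file name, so that a later step can compare them. The sketch below is only an illustration under assumptions: the function name `read_text_files`, the `.txt` filter, and the UTF-8 encoding are not from the thread, and a temporary directory stands in for the real folder so the example runs anywhere.

```python
import os
import tempfile

def read_text_files(directory, extension=".txt"):
    """Return {file name: file contents} for matching files in directory."""
    contents = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.endswith(extension):
            with open(path, "r", encoding="utf-8") as handle:
                contents[name] = handle.read()
    return contents

# Demonstration with a throwaway directory instead of a real path.
with tempfile.TemporaryDirectory() as folder:
    for name, text in [("a.txt", "first document"), ("b.txt", "second document")]:
        with open(os.path.join(folder, name), "w", encoding="utf-8") as handle:
            handle.write(text)
    for name, text in read_text_files(folder).items():
        print(name, "->", text)
```

Returning the contents rather than printing them inside the function lets another module take the dictionary and compare the texts.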
Hi bvdet, x is a number, and filename is a sequence of characters (string). Do I have to explicitly define them?
Hi Rabbit, you are right, I'm trying to multiply a string by a number. I came across something about converting to a float first before multiplying, but I'm trying to figure out how to do this conversion. Any ideas?
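For reference, the built-in `float()` converts a numeric string to a number. Multiplying the string itself by an integer repeats it rather than doing arithmetic. A small sketch with made-up values (`percent_text` and `word_count` are illustrative names only):

```python
percent_text = "30"   # a number stored as a string
word_count = 200

# Multiplying a string by an int repeats the string.
repeated = percent_text * 2          # "3030"

# Convert to a float first to get real arithmetic.
fraction = float(percent_text) / 100
share = fraction * word_count        # about 30% of 200

print(repeated)
print(round(share))
```

Note that `float()` only works on text that looks like a number; applying it to ordinary words from a document raises a ValueError.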
Opening and processing text in files
I'm having a problem calculating various percentages of the text in a text file. Below is my code. I would appreciate any help, thanks.
Code:
def pecent (pento):
    pento = x/100 * filename
text = open ("C:\\fname.txt", 'r')
filename = text.split()
x = [10, 20, 30, 40, 50, 60, 70, 80, 90]
for z in x:
    print pento
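Several things appear to go wrong in the posted code: the file object is split without calling `read()`, the function is never called, `x` (a list) is divided as if it were a number, and the loop variable `z` is never used. A hedged rewrite is sketched below. It assumes "a percentage of the text" means the first p percent of the words. The function name `first_percent` is hypothetical, and an inline sample string stands in for the contents of the file.

```python
def first_percent(words, percent):
    """Return the first `percent` percent of a list of words."""
    cutoff = len(words) * percent // 100
    return words[:cutoff]

# Stand-in for: text = open("C:\\fname.txt", "r").read()
text = ("alpha beta gamma delta epsilon zeta eta theta iota kappa "
        "lambda mu nu xi omicron pi rho sigma tau upsilon")
words = text.split()

# Produce every slice from 10% to 90% in a single loop.
for percent in [10, 20, 30, 40, 50, 60, 70, 80, 90]:
    part = first_percent(words, percent)
    print(percent, len(part), " ".join(part))
```

Passing the word list and the percentage as arguments keeps the function independent of global variables, which removes the ordering problem in the original.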