2049 - 2020GEC1506 - Weightings Scoreboard

Time

2020/06/02 23:30:00 2020/06/09 12:00:00



12309 - WF-ILF_Python (Advanced)   

Description

TF-IDF is a popular way to find important keywords in documents. In this problem, you will implement a simplified version of it, called WF-ILF.

Given a few lines of text, you need to parse them and compute the values described below.

Hint: Read the input with sys.stdin instead of input(), because the input spans multiple lines. Otherwise your submission will not be accepted.

Input

The input consists of a few lines of text.
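Reading multi-line input from sys.stdin can be sketched as follows. This is only an illustration: the helper name read_lines is hypothetical, and dropping blank lines is an assumption, since the statement does not say whether blank lines count toward the number of lines.

```python
import sys


def read_lines(stream):
    """Return the non-blank lines of `stream` without trailing newlines."""
    # Assumption: blank lines are skipped. The problem statement does not
    # say whether they should count toward the number of lines.
    return [line.rstrip("\r\n") for line in stream if line.strip()]


# Typical use in a submission (reads until end of input):
# lines = read_lines(sys.stdin)
```

Iterating over sys.stdin yields one line at a time until end of input, so no line count is needed in advance.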

Output

Read the given input and calculate the WF-ILF of each word that appears in the lines.

Then print only the bottom 3, that is, the three words with the lowest WF-ILF values.

 

WF-ILF  = WF * ILF

 

WF (Word Frequency): the number of times the word occurs across all input lines.

ILF (Inverse Line Frequency): (number of lines) / (number of lines that contain the word)

Example

Suppose the input consists of two lines:

'Hello Hello John'

'Hello Bob'

WF value:

'Hello' is equal to 3      

'John' is equal to 1

'Bob' is also equal to 1

ILF value:

'Hello' is equal to 2/2 = 1  (number of lines = 2 and 'Hello' appears in both lines)

'John' is equal to 2/1 = 2   (number of lines = 2 and 'John' appears in 1 line)

'Bob' is equal to 2/1 = 2    (number of lines = 2 and 'Bob' appears in 1 line)

 

WF-ILF value:

'Hello' is equal to 3 * 1 = 3

'John' is equal to 1 * 2 = 2

'Bob' is equal to 1 * 2 = 2
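The calculation above can be reproduced with a short sketch. The function name wf_ilf is hypothetical, and the code assumes that words are separated by whitespace and compared case-sensitively with no punctuation handling, which the statement does not specify.

```python
def wf_ilf(lines):
    """Map each word to WF * ILF, in order of first appearance."""
    wf = {}  # word -> total occurrences across all lines
    lf = {}  # word -> number of lines that contain the word
    for line in lines:
        words = line.split()  # assumption: whitespace-separated tokens
        for word in words:
            wf[word] = wf.get(word, 0) + 1
        for word in set(words):  # count each line at most once per word
            lf[word] = lf.get(word, 0) + 1
    total_lines = len(lines)
    # dicts keep insertion order, so the result follows first appearance
    return {word: wf[word] * total_lines / lf[word] for word in wf}


scores = wf_ilf(["Hello Hello John", "Hello Bob"])
# scores == {"Hello": 3.0, "John": 2.0, "Bob": 2.0}
```

Because a Python dict preserves insertion order, the returned mapping also records the order in which words first appeared.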

After we get the WF-ILF value of each word, we choose the bottom 3, ordered from the lowest value upward.

(If two words have the same value, they are ordered by their first appearance in the input.)

Therefore, the order for the above example is 'John', 'Bob', 'Hello'.
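One way to get this ordering is a stable sort over a mapping that lists words in order of first appearance. The sketch below assumes such a mapping is already available. The name bottom_three is hypothetical, and the exact output format is not shown here, so the function only returns the words.

```python
def bottom_three(scores):
    """Return up to three words with the lowest scores.

    `scores` must list words in order of first appearance. sorted() is
    stable, so words with equal scores keep that order.
    """
    return sorted(scores, key=scores.get)[:3]


example = {"Hello": 3, "John": 2, "Bob": 2}
print(bottom_three(example))  # ['John', 'Bob', 'Hello']
```

Relying on sort stability avoids carrying an explicit position index as a secondary sort key.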

 
