Order Sentences from Short to Long

Given the current Covid-19 control measures in our country, my wife and daughter are both keen to go abroad.

Learning English and getting a good IELTS score is a practical and realistic way to reach that goal.

The first obstacle seemed to be reciting sentences by heart. Unfortunately, it turned out not to be.

After I collected dozens of sentences from YouTube, laid out as alternating English and Chinese lines, I found that my little princess could not read even the first sentence after quite a while, even with my help.

Background

https://doraemonj.github.io/pics/screenshot_20220514_112220.png

Fig 1: Original file: Not ordered

The first sentence seemed a bit long for a ten-year-old Chinese girl.

So I used Python to find the shortest sentences and move them to the top of the file.
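The core idea fits in a few lines. Here is a minimal sketch, using sample sentences invented for illustration (they are not taken from the IELTS900 file), of how Python's built-in `sorted` with `key=len` brings the shortest string to the front:

```python
# Sample sentences are made up for this illustration; they are not from IELTS900.md.
sentences = [
    "I would like to book a table for two.",
    "Thank you.",
    "Where is the library?",
]

# sorted() returns a new list; key=len compares sentences by their character count.
shortest_first = sorted(sentences, key=len)

for s in shortest_first:
    print(len(s), s)
```

Note that `len` counts characters, not words, so a sentence with a few long words can still land after one with many short words.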

Code

# Mission: move the shortest sentences to the top of the file

# Task 1: Input -- read the md file line by line. A ".md" file is plain text, just like ".txt".
path = r"/Users/tangqiang/private/"         # set path
file_name = "IELTS900.md"                   # set file name
with open(path + file_name, encoding="utf-8") as f:   # the file contains Chinese; it is assumed to be saved as UTF-8
    txt = f.readlines()                     # read the whole file into a list named "txt", one element per line

# Task 2: Process -- drop blank lines, then split English from Chinese
dry = []                                    # non-blank lines are collected here
for el in txt:                              # iterate over all elements in txt
    if len(el.strip()) == 0:                # blank lines (spaces, newlines, etc.) need not be recorded
        continue
    dry.append(el.strip())                  # strip() removes the whitespace before and after the text


# Functions normally go at the top of a script; this one sits here so the code reads in order
def is_contains_chinese(strs):
    """
    Check every character in the string; return True if any one of them is Chinese.
    """
    for _char in strs:
        if '\u4e00' <= _char <= '\u9fa5':   # range that covers the commonly used Chinese characters
            return True
    return False

# Split English and Chinese into two parts
res_en = []                                 # English sentence result list
res_cn = []                                 # Chinese sentence result list
for el in dry:
    if not is_contains_chinese(el):         # no Chinese character: treat the line as English
        res_en.append(el)
    else:
        res_cn.append(el)

dic = dict(zip(res_en, res_cn))             # map each English sentence to its Chinese line; assumes the two lists line up one to one
new_en = sorted(res_en, key=len)            # sort the English sentences by length (number of characters)

# Task 3: Output -- write the newly ordered sentences to a new file
with open(path + "ordered_" + file_name, "w", encoding="utf-8") as f:  # open once; "w" overwrites, so re-running does not append duplicates
    for el in new_en:
        f.write(f"{el}\n")                  # English sentences from short to long
        f.write(f"{dic[el]}\n\n")           # the Chinese sentence that corresponds to the English one
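The script relies on the file strictly alternating English and Chinese lines, and because it stores the translations in a dictionary, an English sentence that appears twice keeps only one translation. As a hedged alternative, the sketch below keeps each English line together with the Chinese line that follows it as a tuple, and sorts by word count rather than character count. The names `contains_chinese` and `pair_and_sort` are hypothetical helpers, not part of the script above, and the sample lines are invented.

```python
def contains_chinese(text):
    """Return True if any character lies in the CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def pair_and_sort(lines):
    """Pair each English line with the next Chinese line, then sort pairs by English word count.

    Hypothetical helper: an English line with no Chinese line after it is dropped,
    and a Chinese line with no English line before it is skipped.
    """
    clean = [ln.strip() for ln in lines if ln.strip()]   # drop blank lines
    pairs = []
    pending_en = None                                    # last English line still waiting for a translation
    for ln in clean:
        if contains_chinese(ln):
            if pending_en is not None:
                pairs.append((pending_en, ln))
                pending_en = None
        else:
            pending_en = ln
    # sorted() is stable, so pairs with equal word counts keep their original order
    return sorted(pairs, key=lambda p: len(p[0].split()))


# Invented sample input, in the same alternating layout as the source file
lines = [
    "How are you doing today?\n", "你今天过得怎么样？\n", "\n",
    "Thank you.\n", "谢谢。\n", "\n",
    "Thank you.\n", "谢谢你。\n",
]
result = pair_and_sort(lines)
for en, cn in result:
    print(en)
    print(cn)
    print()
```

Keeping the pairs as tuples means a repeated English sentence keeps each of its translations, at the cost of silently dropping any line that has no partner.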

Effect

In the end we got what we needed: the sentences ordered from short to long.

https://doraemonj.github.io/pics/screenshot_20220514_113746.png

Fig 2: Ordered file from short to long

Attachment

IELTS900.md

ordered_IELTS900.md