Scripting with Python
This section describes how to write simple Python code in separate files, commonly called scripts, and how to execute them. Practical exercises are proposed along the way, and test exercises are suggested at the end of the section.
To practice, first create a new folder on your computer, for example on your 'Desktop', and name it, for example, pythonscripts. Then launch the Python IDLE.
For advanced use of Python, installing a dedicated IDE is beneficial. SPPAS is developed with the very powerful PyCharm software tool: see the PyCharm Web Page.
Comments and documentation
Comments are not required for the program to work. But comments are necessary! Comments are expected to be appropriate, useful, relevant, adequate and always reasonable.
# This script is doing this and that.
# It is under the terms of a license.
# and I can continue to write what I want after the # symbol
# except that it's not the right way to tell the story of my life
The documentation of a program complements the comments. The two do not share the same goal: comments are used in all kinds of programs, whereas documentation is added to comments in bigger programs and/or projects. Documentation is automatically extracted and formatted by dedicated tools. Documentation is required for sharing the program. See the Docstring Conventions for details. Documentation must follow a convention like, for example, the markup language reST - reStructured Text. Both conventions are used in the SPPAS API, programs and scripts.
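The difference between a comment and a piece of documentation can be illustrated with a short sketch. The function count_items is a hypothetical example, not part of SPPAS; its docstring uses reST-style fields, and the standard inspect module stands in here for a documentation tool.

```python
import inspect


def count_items(items):
    """Return the number of items in a list.

    :param items: (list) the list to count
    :returns: (int) the number of items

    """
    # This comment is only visible to people reading the source code.
    return len(items)


# Unlike the comment, the docstring is stored with the function,
# so a tool can extract it without reading the source file.
print(inspect.getdoc(count_items))
```

The comment explains the code to whoever reads the file, while the docstring is what extraction tools pick up.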
Getting started with scripting in Python
In IDLE, create a new empty file either by clicking on the File menu, then New File, or with the shortcut CTRL+N.
Copy the following line of code in this newly created file:
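The line itself does not appear in this copy of the section. A plausible candidate, assuming the customary greeting suggested by the file name used below, is:

```python
print("Hello world!")
```

With this assumed line, running the script should display the text Hello world! in the IDLE shell window.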
Then, save the file in the pythonscripts folder. By convention, Python source files end with a .py extension, and so the name 01_helloworld.py could be fine.
To execute the program, you can do one of:
- with the mouse: click on the Run menu, then Run module
- with the keyboard: press F5
The expected output is as follows:
A better practice while writing scripts is to describe who wrote the script, what it does and why it was written. A nifty trick is to create a skeleton for any future script. Such a ready-to-use script is available in the SPPAS package with the name skeleton.py.
Blocks
Blocks in Python are defined by indentation. Tabs and spaces can both be used, but using spaces is recommended.
>>> if a == 3:
...     # this is a block using 4 spaces for indentation
...     print("a is 3")
...
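As a slightly longer sketch of how indentation delimits blocks, the following script can be saved in a file and executed. The variable names and messages are chosen for illustration only.

```python
a = 3
messages = []

if a == 3:
    # These two lines share the same indentation: they form one block.
    messages.append("a is 3")
    messages.append("still inside the if block")
else:
    messages.append("a is not 3")

# Back at the left margin: this line runs whatever the value of a.
messages.append("after the block")

for m in messages:
    print(m)
```

Changing the indentation of a line changes the block it belongs to, and therefore the behaviour of the program.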
Functions
Simple function
A function does something: it starts with its definition, which is followed by its lines of code in a block.
Here is an example of a function:
def print_vowels():
    """ Print the list of French vowels on the screen. """
    vowels = ['a', 'e', 'E', 'i', 'o', 'u', 'y', '@', '2', '9', 'a~', 'o~', 'U~']
    print("List of French vowels:")
    for v in vowels:
        print(v)
What does the print_vowels() function do? This function declares a list named vowels. Each item of the list is a string representing a French vowel encoded in X-SAMPA. Of course, this list can be replaced by any other set of strings. The next line prints a message. Then, a loop prints each item of the list.
At this stage, if a script with this function is executed, it will do… nothing! Actually, the function is created, but it must be invoked in the main function to be interpreted by Python. The main is as follows:
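The main part is not reproduced in this copy of the section. A minimal sketch, assuming the conventional Python idiom for a script entry point, could look like this (the function is repeated so that the example runs on its own):

```python
def print_vowels():
    """Print the list of French vowels on the screen."""
    vowels = ['a', 'e', 'E', 'i', 'o', 'u', 'y', '@', '2', '9', 'a~', 'o~', 'U~']
    print("List of French vowels:")
    for v in vowels:
        print(v)


# Conventional entry point: this block runs only when the file is
# executed as a script, not when it is imported as a module.
if __name__ == '__main__':
    print_vowels()
```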
Practice: create a copy of the file skeleton.py, then make a function to print Hello World! (solution: ex01_hello_world.py).
Practice: Create a function to print plosives and call it in the main function (solution: ex02_functions.py).
One can also create a function to print glides, another one to print affricates, and so on. Hmm… this sounds a little bit tedious!
Function with parameters
Rather than writing the same lines of code over and over with only a minor difference, we can declare parameters for the function to make it more generic. Notice that the number of parameters of a function is not limited!
In the example, we can replace the print_vowels() function and the print_plosives() function by a single function print_list(mylist), where mylist can be any list containing strings or characters. If the list contains values of other types, such as numbers, they must be converted to strings to be printed out. This can result in the following function:
def print_list(mylist, message=" -"):
    """ Print a list on the screen.

    :param mylist: (list) the list to print
    :param message: (string) an optional message to print before each element

    """
    for item in mylist:
        # str() converts non-string items, such as numbers, before printing
        print("{:s} {:s}".format(message, str(item)))
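A generic function like this can be called with different lists and with or without the optional message. The sketch below repeats a compact version of the function so that it runs on its own; the sample lists are illustrative data.

```python
def print_list(mylist, message=" -"):
    """Print a list on the screen, one item per line."""
    for item in mylist:
        # str() lets the function accept numbers as well as strings.
        print("{:s} {:s}".format(message, str(item)))


vowels = ['a', 'e', 'i', 'o', 'u']          # sample data
plosives = ['p', 't', 'k', 'b', 'd', 'g']   # sample data

print_list(vowels)                          # uses the default message " -"
print_list(plosives, message="plosive:")    # message given by name
print_list([1, 2.5], "number:")             # numerical values also work
```

Using a default value for message keeps the simplest call short while still allowing each call to customise the prefix.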
Function return values
Functions are used to do a specific job, and the result of the function can be captured by the program. In the following example, the function would return a boolean value, i.e., True if the given string has no character.
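The example function is not reproduced in this copy of the section. A minimal sketch matching the description could be written as follows; the name is_empty and its parameter are assumptions made for illustration.

```python
def is_empty(mystr):
    """Return True if the given string has no character.

    :param mystr: (str) the string to check
    :returns: (bool)

    """
    return len(mystr) == 0


# The returned value can be stored in a variable...
result = is_empty("")
print(result)

# ...or used directly in a condition.
if not is_empty("abc"):
    print("'abc' contains at least one character")
```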
Practice: Add this function in a new script and try to print various lists (solution: ex03_functions.py)
Reading/Writing files
Reading data from a file
Now, we’ll try to get data from a file. Create a new empty file with the following lines, and add as many lines as you want; then, save it with the name phonemes.csv by using UTF-8 encoding:
occlusives ; b ; b
occlusives ; d ; d
fricatives ; f ; f
liquids ; l ; l
nasals ; m ; m
nasals ; n ; n
occlusives ; p ; p
glides ; w ; w
vowels ; a ; a
vowels ; e ; e
The following statements are typically used to read the content of a file. The first argument of the open function is the name of the file, including the path (relative or absolute); the second argument is the opening mode (r is the default value, used for reading).
Practice: Add these lines of code in a new script and try it (solution: ex04_reading_simple.py)
fp = open("phonemes.csv", 'r')
for line in fp:
    # do something with the line stored in variable line
    print(line.strip())
fp.close()
The following is a solution with the ability to deal with various file encodings, thanks to the codecs library:
import codecs

def read_file(filename):
    """ Get the content of file.

    :param filename: (string) Name of the file to read, including its path.
    :returns: List of lines

    """
    with codecs.open(filename, 'r', encoding="utf8") as fp:
        return fp.readlines()
In the previous code, the codecs.open function takes three arguments: the name of the file, the opening mode, and the encoding. The readlines() function gets each line of the file and stores it into a list.
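Once the lines are read, each one can be split into its fields. The sketch below assumes Python 3, where the built-in open function accepts an encoding argument, and assumes that fields are separated by a semicolon as in phonemes.csv. The function name read_columns is hypothetical, and a small sample file is written first so that the example runs on its own.

```python
import os
import tempfile

# Write a small sample file so the sketch does not depend on phonemes.csv.
sample = "occlusives ; b ; b\nnasals ; m ; m\nvowels ; a ; a\n"
path = os.path.join(tempfile.mkdtemp(), "phonemes.csv")
with open(path, "w", encoding="utf-8") as fp:
    fp.write(sample)


def read_columns(filename, separator=";"):
    """Return a list of rows, each row being a list of stripped fields."""
    rows = []
    with open(filename, "r", encoding="utf-8") as fp:
        for line in fp:
            line = line.strip()
            if len(line) == 0:
                continue  # skip empty lines
            rows.append([field.strip() for field in line.split(separator)])
    return rows


for row in read_columns(path):
    print(row)
```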
Practice: Write a script to print the content of a file (solution: ex05_reading_file.py)
Notice that the Python os module provides useful methods to perform file-processing operations, such as renaming and deleting. See Python documentation for details: https://docs.python.org/3.8/
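As an illustration of these operations, here is a short sketch that renames and then deletes a file. It works in a temporary folder, and the file names are arbitrary, so it does not touch any existing data.

```python
import os
import tempfile

folder = tempfile.mkdtemp()          # temporary folder for the demonstration
old_name = os.path.join(folder, "draft.txt")
new_name = os.path.join(folder, "final.txt")

with open(old_name, "w") as fp:
    fp.write("some text\n")

os.rename(old_name, new_name)        # rename the file
print(os.listdir(folder))            # list the content of the folder

os.remove(new_name)                  # delete the file
print(os.path.exists(new_name))      # check whether the file still exists
```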
Writing data to a file
Writing a file requires opening it in a writing mode:
- w is the mode to write data; it will erase any existing file;
- a is the mode to append data in an existing file.
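The difference between the two modes can be observed with a short sketch. It writes to a temporary file whose name is arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w") as fp:      # creates the file
    fp.write("first\n")

with open(path, "w") as fp:      # 'w' again: the previous content is erased
    fp.write("second\n")

with open(path, "a") as fp:      # 'a': the new text is added at the end
    fp.write("third\n")

with open(path, "r") as fp:
    content = fp.read()
print(content)
```

After these three steps the file contains only the lines second and third, because the second opening in w mode discarded the first line.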
A file can be read in one encoding and saved in another one. This is useful for writing a script that converts the encoding of a set of files. The following could help to create such a script:
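# Converting the encoding of a file:
# file_location and file_encoding must be defined beforehand
import codecs
file_stream = codecs.open(file_location, 'r', file_encoding)
file_output = codecs.open(file_location+'utf8', 'w', 'utf-8')
for line in file_stream:
    file_output.write(line)
# close both files so that the output is flushed to disk
file_stream.close()
file_output.close()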
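The same idea can be wrapped in a function and tried on a small file. This is a sketch under stated assumptions: the function name convert_to_utf8, the file names and the choice of ISO-8859-1 as source encoding are all illustrative.

```python
import codecs
import os
import tempfile


def convert_to_utf8(src, dst, src_encoding):
    """Copy src to dst, decoding with src_encoding and encoding in UTF-8."""
    with codecs.open(src, 'r', src_encoding) as fin:
        with codecs.open(dst, 'w', 'utf-8') as fout:
            for line in fin:
                fout.write(line)


# Demonstration on a temporary file encoded in ISO-8859-1 (Latin-1).
folder = tempfile.mkdtemp()
source = os.path.join(folder, "latin1.txt")
target = os.path.join(folder, "utf8.txt")
with codecs.open(source, 'w', 'iso-8859-1') as fp:
    fp.write(u"caf\u00e9\n")

convert_to_utf8(source, target, 'iso-8859-1')

# Read the result as raw bytes to inspect the new encoding.
with open(target, 'rb') as fp:
    raw = fp.read()
print(raw)
```

Using with statements closes both files automatically, even if an error occurs during the conversion.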
Python tutorials
Here is a list of websites with tutorials, from the easiest to the most complete:
Exercises to practice
Exercise 1: How many vowels are in a list of phonemes? (solution: ex06_list.py)
Exercise 2: Write an X-SAMPA to IPA converter. (solution: ex07_dict.py)
Exercise 3: Compare 2 sets of data using NLP techniques (Zipf's law, TF-IDF) (solution: ex08_counter.py)