Ryan's Manuals

Python

Children and children disguised as grad students love Python!


Contents



Hello Python

import antigravity
print("Hello, Python!")

Why use Python?

Easy for humans to read, easy to write c extensions, simple for beginners, complex enough to enable machine learning.

Stop here, this manual is incomplete! New sections will be filled out soon, when I trace back my steps and re-learn the language properly. For now, it is an empty shell. I focus primarily on Python 3.X in my writing, though I originally learned Python 2.

Quality Assurance frequently requires one of the following:

  1. Excel spreadsheet manipulation.
  2. Clicking through user interfaces.
  3. Writing documentation based on tabular system information.

All of these use cases can see fantastic acceleration with just a little bit of automation. I leverage Pandas, Selenium, Docx and Xlrd, and other modules to move data between Excel Spreadsheets, Word Documents, and Selenium Actions.

Environment Setup

If running windows, you’ll need the latest python installer. If on a GNU/Linux distribution, you’ll be able to install python3-dev with your package manager; all tools will be installed automatically. Ensure that you have the pip package manager installed by running pip3 -v.

Most of my python is written either in VIM or VS Code with the AutoPEP8 linter. You’ll need pylint installed to use the VS Code extension.

# Install whatever you want.
pip3 install --upgrade pip
pip3 install jupyter pylint pandas --user
# Ensure .local/bin is in your PATH.

Recommended Learning Path

If programming for the very first time, I’d recommend taking CodeCademy’s Learn Python 3 course, ignoring the Pro materials.

Pandas Data Analysis

Pandas is a Python library for creating and manipulating dataframes, allowing for easy organization and processing of data. Xlrd enables easy import of excel files into Pandas Dataframe format, and manipulation is easy from there.

Excel Table Container

This simple container class holds a spreadsheet and make it easier to access and manipulate individual sheets. It is part of rcf_lib.py

class TableGlob:

    filepath = ""

    def __init__(self, name):
        self.name = name

    def importExcelTable(self, filepath):
        if not self.filepath:
            if( os.path.isfile(str(filepath)) and str(filepath[-4:]) == 'xlsx'):
                dbg(e,"good","TblGlb","Data found, importing {0}...".format(filepath))
                self.tablefile = pd.ExcelFile(filepath)
                self.sheetlist = self.tablefile.sheet_names
                dbg(e,"good","TblGlb","Sheets: {0}".format(self.sheetlist))
                self.filepath = filepath
                return True
            else:
                dbg(e,"fail","TblGlb","Path incorrect or table invalid!")
                return False
        else:
            dbg(e,"fail","TblGlb","Table already imported!")
            return False

    def sheets(self):
        return self.sheetlist

    # Returns a pandas dataframe from the excel sheet specified.
    def getdf(self, sheetid):
        if str(sheetid) in self.sheetlist:
            return self.tablefile.parse(sheetid)

    # Mostly for ensuring import was successful.
    def dataShapeTuple(self):
        tuples = []
        for sheetid in self.sheetlist:
            temptable = self.getdf(str(sheetid))
            tuples.append((temptable.shape[0],temptable.shape[1]))

        dbg(e,"good","TblGlb","Data Shape (ROW,COL): {0}".format(tuples))
        return tuples

Documentation Generator

The following code strips data from tables generated by the Microsoft TFS work tracking bureaucratic engine, and reassembles it into partial documentation.

#!/bin/python3
'''
TFSDataToSectionalTemplateV2.py

2018 04

This program simply strips useful data from an excel file
and structures it as rich text for placement into a word
document for Release Documentation.

Changelog:
V0:
- Began development.
- Implemented excel spreadsheet import.
- Implemented excel spreadsheet processing.
- Implemented command line output.
V1:
- Added docx package.
- Implemented stylized word document export.
V2:
- Added 'column order resilience,' spreadsheet can be less structured.
- Improved table output.
- Added more output and instructions for coworkers.
'''
import os
import pandas as pd
from docx import Document
from docx.shared import Inches


#COLUMN VARIABLES:
cID=0
cITEMTYPE=0
cTITLE=0


#SETUP AND SPREADSHEET IMPORT:
def excelData():
  spreadsheet = pd.ExcelFile('tfsdata.xlsx')
  dataframe = spreadsheet.parse('Sheet1')
  return(dataframe)

def excelOutputSetup():
  writer = pd.ExcelWriter('formattedOutput.xlsx', engine='xlsxwriter')
  return(writer)

print("TFS Excel To Word Template Translator")
print("Ensure this tool is placed in the same directory as 'tfsdata.xlsx'")

if(os.path.isfile('tfsdata.xlsx')):
  print("Data found, processing.")
  data = excelData()
else:
  print("ERROR: 'tfsdata.xlsx' not found! Exiting.")
  exit()



#Parse columns:
print("Data shape: x:"+str(data.shape[0])+" y:"+str(data.shape[1]))
titles=data.columns.tolist()
print(titles)
index=0
for title in titles:
  if(title == 'ID'):
    cID=index
  elif(title == 'Work Item Type'):
    cITEMTYPE=index
  elif(title == 'Title'):
    cTITLE=index
  index+=1



#CREATE DOCUMENT
if(os.path.isfile("Release_Template.docx")):
  os.remove("Release_Template.docx")

doc = Document("redactedT.docx")
styles=doc.styles
heading=doc.add_heading()



#TABLE WRITER
def writeTable():
  if(len(featureDict)>0):
    #heading.style = doc.styles['redacted Heading 2 - No Number']
    doc.add_heading('Related User Stories', level=2)
    table = doc.add_table(rows=1, cols=3)
    table.style = doc.styles['Grid Table 2 Accent 2']
    table.autofit = True
    table.columns[0].width = Inches(0.5)
    table.columns[1].width = Inches(1)
    table.columns[2].width = Inches(6)
    hdr_cells = table.rows[0].cells
    hdr_cells[0].text = 'ID'
    hdr_cells[1].text = 'Work Item Type'
    hdr_cells[2].text = 'Title'
    for key in featureDict:
      row_cells = table.add_row().cells
      row_cells[0].text = str(key)
      row_cells[1].text = "User Story"
      row_cells[2].text = featureDict[key]



#SECTION BODY CONTENT WRITER
def fillSection():
  doc.add_paragraph("This is a line of filler body content for a generated redacted document template. TODO: REPLACE ME.")

def fillBody():
  heading.style = doc.styles['redacted Heading 2 - No Number']
  doc.add_heading('Additional Details One', level=2)
  fillSection()
  doc.add_heading('Additional Details Two', level=2)
  fillSection()



#READ THROUGH TABLE AND WRITE DOCUMENT
print("\n\tWRITING DATA TO OUTPUT.DOCX\nREF#\tTITLE\n")

tableLock=True
makeTable=False
featureDict={}

for x in range(0,len(data.index)):
  if(str(data.iloc[x,cITEMTYPE])=="Feature"):
    #Print table for previous dict to doc:
    writeTable()
    fillBody()
    #doc.add_page_break()
    featureDict={}
    print("\n\n"+str(data.iloc[x,cID])+"\t"+str(data.iloc[x,cTITLE])+"\n\t"+("-"*(len(str(data.iloc[x,cTITLE])))))
    doc.add_heading(str(data.iloc[x,cTITLE]), level=1)
    tableLock=False


  elif(tableLock==False and str(data.iloc[x,cITEMTYPE])=="User Story"):
    print(str(data.iloc[x,cID])+"\t"+str(data.iloc[x,cTITLE]))
    featureDict[str(data.iloc[x,cID])] = str(data.iloc[x,cTITLE])
  else:
    tableLock=True



#SAVE AND CLEAN UP
writeTable()
doc.save("redacted_GeneratedReleaseX.docx")
print("\n\tGENERATION COMPLETE - SAVED AS 'redacted_GeneratedReleaseX.docx'")
print("\n\tTODO:\n\t\t-Generate table of contents.\n\t\t-Add full list of user stories.\n\t\t-Rename document.")

UnitTest

UnitTest is a simple module to assert that your code is functioning as designed. If building a library, the following snippet can be added. When methods are run from an external python file, nothing will happen. When run directly, the unit tests will run. Quite useful:

if __name__ == '__main__':
  unittest.main()

Multi-Language Testing w/Selenium

I wrote the following test a few months ago to test if mulitilingual pages were accessible.

The program generates and appends unit tests to a class based on dictionary entries. I’m hoping to adapt it to read from an excel table and run test cases based on the table content (With the help of a human QA.)

import unittest
import selenium
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options
import time
import sys
from pyfiglet import figlet_format
from termcolor import cprint

#Options:
rev = "6.13.2"
debug = True #Set to FALSE if production.
#root = "redacted-cas-01.redacted.com" #the environment to test, without languages or cs addons.
root = "redacted.ca"
chrome_options = Options()
chrome_options.add_argument("--window-size=200,")
chrome_options.add_argument("disable-infobars")
chrome_options.add_argument("--disable-notifications")
#chrome_options.add_argument("--headless") #use when development is finished.

#Global vars
page = ""
headline_EN = "" #Or targeted term.
headline_FR = "" #Or targeted term.

def setposition(x):
    #x.browser.set_window_position("100","600")
    x.browser.set_window_position("50","600")
    return(0)

def setsize(x):
    #x.browser.set_window_size("1250","800") #DESKTOP
    x.browser.set_window_size("1500","800") #DESKTOP
    return(0)

#DEBUG Message functions:
#dm prints a debug message, which is hidden when debug=FALSE.
#dms prints a success message, and is included at the end of each test case.
def dm(msg):
    if (debug):
        print("\n"+'\033[93m'+"DEBUG:   "+'\033[0m'+str(msg), end='')

def dma(msg): #append to dm
    if (debug):
        print(str(msg), end='')

def dms(msg):
    print("\n"+'\033[92m'+"SUCCESS: "+'\033[0m'+str(msg))

#openPage opens the redacted homepage.
def openPage():
    dm("Opening homepage...")
    browser = webdriver.Firefox()
    browser.get('http://www.redacted.ca/')

#EN/FR test case generator:
class TEST007x_LanguageToggle_X(unittest.TestCase):
    dm("Container init.\n")

pagetests = {
             "/your-care": ["Your Care","Vos soins"],
             "/health-info": ["Public Health","sur la"],
             "/science-and-research": ["Inspiring","Faire"],
             "/education": ["Clinical","Formation"],
             "/get-involved": ["Get Involved","Impliquez-vous"],
             "/driving-change": ["Leading","agent du"],
            }

def generatePageTest(name, en, fr):
    def test(self):
        #cprint(figlet_format(str(name), font='small')) #Enable for fun.
        dm("Running w/ page:"+str(name)+" en:"+str(en)+" fr:"+str(fr)+"\n")

        dm("Opening Chrome... ")
        self.browser = webdriver.Chrome(options=chrome_options)
        setsize(self)
        setposition(self)
        dma("Browser open.")

        #Link to global vars
        global page
        global headline_EN
        global headline_FR
        #Assign test vars
        page = name #must begin with "/"
        headline_EN=en
        headline_FR=fr

        dm("Opening "+str(root)+str(page))
        self.browser.get("https://"+str(root)+"/en"+str(page))
        self.browser.execute_script("document.body.style.zoom='100%'")
        self.assertIn(headline_EN, self.browser.title)
        dm("English homepage confirmed.")

        #Click FR toggle.
        dm("Finding FR language toggle.")
        langbutton = self.browser.find_element_by_link_text('FR')
        dm("Clicking element:"+str(langbutton.text)+"... ")
        langbutton.click()
        dma("Click!")

        #Check for french page.
        self.assertIn(headline_FR, self.browser.title)
        dm("French page confirmed.")

        #Click EN toggle.
        dm("Finding EN language toggle.")
        langbutton_en = self.browser.find_element_by_link_text('EN')
        dm("Clicking element:"+str(langbutton_en.text)+"... ")
        langbutton_en.click()
        dma("Click!")

        #Check for english page
        self.assertIn(headline_EN, self.browser.title)
        dm("English page confirmed.")

        dms("Language toggles on "+str(root)+str(page)+" are functional.")

        dm("Teardown.")
        self.browser.close()

    return test

def FdynaStart():
    def test(self):
        dm("Opening Chrome... ")
        self.browser = webdriver.Chrome(options=chrome_options)
        setsize(self)
        setposition(self)
        dma("Browser open.")
    return test

def FdynaEnd():
    def test(self):
        dm("Teardown.")
        self.browser.close()
    return test

for name, params in pagetests.items():
    #dynaStart = FdynaStart()
    #dynaTear = FdynaEnd()
    pageTestFunction = generatePageTest(name, params[0], params[1])
    #setattr(TEST007x_LanguageToggle_X, 'test_{0}_01_Startup'.format(name), dynaStart)
    setattr(TEST007x_LanguageToggle_X, 'test_{0}_02_Run'.format(name), pageTestFunction)
    #setattr(TEST007x_LanguageToggle_X, 'test_{0}_03_Teardown'.format(name), dynaTear)

#Run all tests:
if __name__ == '__main__':
    print('\033[91m'+"\nredacted Test Battery One\nredacted")
    print("[RCF] Version: "+str(rev)+'\033[0m'+"\n\n")
    unittest.main(verbosity=2)

GUIs.

GUIS, or Graphical User Interfaces, are designed to allow the general public to interact with the software running beneath.

QT5

https://build-system.fman.io/pyqt5-tutorial

TK

By default, Python uses TKinter (TK Interface) to create simple GUIs. Below is an simple example of a GUI with a window, a label, and a button.

import tkinter as tk
top = tk.Tk()

l = tk.Label(top, text="Tiny GUI")
l.pack()

b = tk.Button(top,text="Button!")
b.pack()

top.title("RCF TK-Interface")
top.resizable(width=False, height=False)
top.geometry('{}x{}'.format(300, 60))
top.mainloop()

MongoDB

After installing MongoDB and loading http://media.mongodb.org/zips.json into the database, (or another from https://github.com/ozlerhakan/mongodb-json-files,)

I decided to use the sample datasets:

# MongoDB
import pymongo
from pymongo import MongoClient
import pandas as pd
import json

# Pretty Printing - GH: EdwardBetts/pprint_color.py
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import Terminal256Formatter
from pprint import pformat

def cprint(obj):
    print(highlight(pformat(obj), PythonLexer(), Terminal256Formatter()))

print('Mongo version '+ pymongo.__version__)
client = MongoClient('localhost', 27017)
db = client.ryan
collection = db.restaurants

Mongo version 3.7.1

cursor = collection.find().sort('type_of_food',pymongo.DESCENDING).limit(3)
for doc in cursor:
    cprint(doc) # This prints the JSON entity.
    print()

Aggregation

# Up for breakfast?

pipeline = [
    {
        "$group":{
            "_id":"$type_of_food",
            "Average Rating":{"$avg":"$rating"},
            "Count":{"$sum":1}
        }
    },
    {
        "$sort":{
            # "Average Rating":-1,
            "Count": -1
        }
    }
]

agg = collection.aggregate(pipeline)
ratings = pd.DataFrame(list(agg))
ratings = ratings.set_index("_id")
ratings.head(10)

Average RatingCount
_id
Curry5.036158902
Pizza4.914141500
Chinese4.893678174
Kebab4.885621154
Fish & Chips5.036697116
American4.61702195
Turkish4.91891974
Lebanese4.80597070
Chicken4.41000053
Caribbean4.58333346
ratings.tail()

Average RatingCount
_id
Pasta6.01
Cakes5.51
Nigerian4.51
Punjabi6.01
Spanish4.51

Well, that’s pretty neat.

Another one:

pipeline = [
    {"$match": {"type_of_food":"Breakfast"}}
]

agg = collection.aggregate(pipeline)
breakfast = pd.DataFrame(list(agg))
breakfast

URL_idaddressaddress line 2nameoutcodepostcoderatingtype_of_food
0http://www.just-eat.co.uk/restaurants-133-take...55f14312c7447c3da7051b36133 Fullarton StreetIrvine133 TakeawayKA128DGNot yet ratedBreakfast
1http://www.just-eat.co.uk/restaurants-alfiesca...55f14312c7447c3da7051d66161 Helen StreetGlasgowAlfies Cafe - Collection OnlyG513HDNot yet ratedBreakfast
2http://www.just-eat.co.uk/restaurants-alldaybr...55f14312c7447c3da7051dc8227 Lower House LaneLiverpoolAll Day Breakfast BarL112SF4.5Breakfast
3http://www.just-eat.co.uk/restaurants-annescaf...55f14312c7447c3da7051f2764 Anne RoadSmethwickAnne's CafeB662NY5.5Breakfast
4http://www.just-eat.co.uk/restaurants-benvenut...55f14313c7447c3da70523b7Bell ParadeWest WickhamBenvenuti - Collection OnlyBR40RHNot yet ratedBreakfast
5http://www.just-eat.co.uk/restaurants-big-dadd...55f14313c7447c3da70524752 Norris StreetCheshireBig Daddy HouseWA27RLNot yet ratedBreakfast
6http://www.just-eat.co.uk/restaurants-bigphill...55f14313c7447c3da7052499Old Club BuildingTyne and WearBig Phil'lers DeliNE270EP5Breakfast

This Stack Overflow post should give you a good idea of how to import the json on Windows and nix.

Deep Learning on Azure

Microsoft now has free learning courses hosted on https://docs.microsoft.com/en-gb/learn/browse/?roles=developer

One course is focused on using jupyter notebooks and pytorch with azure containers. https://docs.microsoft.com/en-gb/learn/modules/interactive-deep-learning/index

Distribute your Python Software

PyInstaller?



Site Directory



Page Information

Title: Python
Word Count: 1743 words
Reading Time: 9 minutes
Permalink:
https://manuals.ryanfleck.ca/py/