Skip to main content

5 posts tagged with "python"

View All Tags

· One min read

Get started

Requires Python 3.6+, BeautifulSoup4 and urllib3.

Install by using pip

pip3 install pyseoanalyzer

Command-line Usage

If you run without a sitemap it will start crawling at the homepage.


Or you can specify the path to a sitemap to seed the urls to scan list.

seoanalyze --sitemap

HTML output can be generated from the analysis instead of json.

seoanalyze --sitemap --output-format html

Run From cloned Repository

python -f html > results.html

· One min read
  1. Let’s start by installing the python3-venv package that provides the venv module.
sudo apt install python3-venv
  1. Once the module is installed we are ready to create virtual environments for Python 3.
  2. Switch to the directory where you would like to store your Python 3 virtual environments.
cd /path/to/your/project
  1. Within the directory run the following command to create your new virtual environment.
python3 -m venv venv
  1. The command above creates a directory called venv, which contains a copy of the Python binary, the Pip package manager, the standard Python library and other supporting files.
  2. To start using this virtual environment, you need to activate it by running the activate script.
source venv/bin/activate

· 2 min read

Complete table showing the various information that the logging system can output:

%(name)sLogger's name
%(levelno)sLog level in digital form
%(levelname)sLog level in text form
%(pathname)sThe full path name of the module that called the log output function, may not have
%(filename)sThe file name of the module that called the log output function
%(module)sCall the module name of the log output function
%(funcName)sCall the function name of the log output function
%(lineno)dThe line of code where the statement that called the log output function is located
%(created)fCurrent time, expressed in floating point numbers representing the time of the UNIX standard
%(relativeCreated)dThe number of milliseconds since the Logger was created when the log information was output.
%(asctime)sThe current time in the form of a string. The default format is "2003-07-08 16:49:45,896". Behind the comma is the millisecond
%(thread)dThread ID. Maybe not
%(threadName)sThread name. Maybe not
%(process)dProcess ID. Maybe not
%(message)sUser output message

· 4 min read

Trying to understand what is refactoring, follow the guide from Real Python.

1. Functions That Should Be Objects

Without reading the code, you will not know if it will modify the original image or create a new image."
def load_image(path):
with open(path, "rb") as file:
fb = file.load()
image = img_lib.parse(fb)
return image

def crop_image(image, width, height):
return image

def get_image_thumbnail(image, resolution=100):
return image

To call the codes:

from imagelib import load_image, crop_image, get_image_thumbnail

image = load_image('~/face.jpg')
image = crop_image(image, 400, 500)
thumb = get_image_thumbnail(image)

Symptoms of code using functions that could be refactored into classes:

Similar arguments across functions
Higher number of Halstead h2 unique operands (All the variables and constants are considered operands)
Mix of mutable and immutable functions
Functions spread across multiple Python files

Here is a refactored version of those 3 functions, where the following happens:

.__init__() replaces load_image().
crop() becomes a class method.
get_image_thumbnail() becomes a property.

The thumbnail resolution has become a class property, so it can be changed globally or on that particular instance:
class Image(object):
thumbnail_resolution = 100

def __init__(self, path):

def crop(self, width, height):

def thumbnail(self):
return thumb

This is how the refactored example would look:

from imagelib import Image

image = Image('~/face.jpg')
image.crop(400, 500)
thumb = image.thumbnail

In the resulting code, we have solved the original problems:

It is clear that thumbnail returns a thumbnail since it is a property, and that it doesn’t modify the instance.
The code no longer requires creating new variables for the crop operation.

2. Objects That Should Be Functions

Here are some tell-tale signs of incorrect use of classes:

Classes with 1 method (other than .__init__())
Classes that contain only static methods
authentication class
class Authenticator(object):
def __init__(self, username, password):
self.username = username
self.password = password

def authenticate(self):
# Do something
return result

It would make more sense to just have a simple function named authenticate() that takes username and password as arguments:
def authenticate(username, password):
# Do something
return result

3. Converting “Triangular” Code to Flat Code

These are the symptoms of highly nested code:

A high cyclomatic complexity because of the number of code branches
A low Maintainability Index because of the high cyclomatic complexity relative to the number of lines of code
def contains_errors(data):
if isinstance(data, list):
for item in data:
if isinstance(item, str):
if item == "error":
return True
return False

Refactor this function by “returning early”

def contains_errors(data):
if not isinstance(data, list):
return False
return data.count("error") > 0

Another technique to reduce nesting by list comprehension

Common practise to create list, loop through and check for criteria.

results = []
for item in iterable:
if item == match:

Replace with:

results = [item for item in iterable if item == match]

4. Handling Complex Dictionaries With Query Tools

It does have one major side-effect: when dictionaries are highly nested, the code that queries them becomes nested too.

data = {
"network": {
"lines": [
"name.en": "Ginza",
"": "銀座線",
"color": "orange",
"number": 3,
"sign": "G"
"name.en": "Marunouchi",
"": "丸ノ内線",
"color": "red",
"number": 4,
"sign": "M"

If you wanted to get the line that matched a certain number, this could be achieved in a small function:

def find_line_by_number(data, number):
matches = [line for line in data if line['number'] == number]
if len(matches) > 0:
return matches[0]
raise ValueError(f"Line {number} does not exist.")
find_line_by_number(data["network"]["lines"], 3)

There are third party tools for querying dictionaries in Python. Some of the most popular are JMESPath, glom, asq, and flupy.

import jmespath"network.lines", data)
[{'name.en': 'Ginza', '': '銀座線',
'color': 'orange', 'number': 3, 'sign': 'G'},
{'name.en': 'Marunouchi', '': '丸ノ内線',
'color': 'red', 'number': 4, 'sign': 'M'}]

If you wanted to get the line number for every line, you could do this:

> > >"network.lines[*].number", data)
> > > [3, 4]

You can provide more complex queries, like a == or <. The syntax is a little unusual for Python developers, so keep the documentation handy for reference.

Find the line with the number 3

> > >"network.lines[?number==`3`]", data)
> > > [{'name.en': 'Ginza', '': '銀座線', 'color': 'orange', 'number': 3, 'sign': 'G'}]

· 2 min read

Find package

py -m pip freeze | findstr py

Upgrade Pip

python -m pip install --upgrade pip --trusted-host --trusted-host
py -m pip install --upgrade pip --trusted-host --trusted-host
C:\Users\xxx\AppData\Local\Programs\Python\Python38\python.exe -m pip install --upgrade pip --trusted-host --trusted-host

Install Pip

py --trusted-host --trusted-host

Pip Install command

pip install --trusted-host --trusted-host <package> --user
py -m pip install --trusted-host --trusted-host <package> --user

some notes on new setup

python -m pip install --trusted-host --trusted-host <package> --user

install from whl

pip install some-package.whl

Add to path


Use this to install from local drive

pip install ./downloads/PythonSetup3.8/flake8-3.8.3.tar.gz  --trusted-host --trusted-host
py -m pip install ./downloads/PythonSetup3.8/dateformat-0.9.7.tar.gz --trusted-host --trusted-host --user

Install requirement

pip install -r requirements.txt --trusted-host --trusted-host
python -m pip install -r requirements.txt --trusted-host --trusted-host

When upgrading, reinstall all packages even if they are already up-to-date.

pip install --upgrade --force-reinstall <package> --trusted-host --trusted-host
pip install --upgrade --force-reinstall --trusted-host --trusted-host <package>

pip install -I <package>
pip install --ignore-installed <package>

Once in a while, a Python package gets corrupted on your machine and you need to force pip to reinstall it. As of pip 10.0, you can run the following:

This will force pip to re-install corrupted package and all its dependencies.

pip install --force-reinstall <corrupted package>

If you want to re-download the packages instead of using the files from your pip cache, add the --no-cache-dir flag:

pip install --force-reinstall --no-cache-dir <corrupted package>

If you want to upgrade the package, you can run this instead:

pip install --upgrade <corrupted package>

The --upgrade flag will not mess with the dependencies of corrupted package unless you add the --force-reinstall flag.

If, for some reason, you want to re-install corrupted package and all its dependencies without first removing the current versions, you can run:

pip install --ignore-installed <corrupted package>

By the way, if you’re using a pip version that is less than 10.0, it’s time to update pip:

pip install --upgrade pip