Links

[https://realpython.com/]

Python tips

sample python program

import sys

def main(argv):
    # argv holds the command-line arguments without the script name
    print("hello")

if __name__ == "__main__":
    main(sys.argv[1:])

interactive shell help

Use help() and dir() in the Python interactive shell to read the documentation of a module, class or function and to list the names it defines

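import pandas as pd
# show the documentation of the pandas module
help(pd)
# list the names (functions, classes, submodules) defined in the module
dir(pd)
# show the usage and documentation of a single function
help(pd.read_csv)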
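The same two ideas also work outside the interactive shell. A minimal sketch, using the standard-library json module instead of pandas so it runs anywhere; public_names and first_doc_line are hypothetical helper names, not part of any library:

```python
import inspect
import json


def public_names(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


def first_doc_line(obj):
    """Return the first line of obj's docstring, or an empty string."""
    doc = inspect.getdoc(obj) or ""
    return doc.splitlines()[0] if doc else ""


names = public_names(json)
print(names)                       # includes names such as 'dumps' and 'loads'
print(first_doc_line(json.loads))  # one-line summary of json.loads
```

Filtering out the underscore names keeps the dir() output short enough to scan.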

argparse

[https://mkaz.blog/code/python-argparse-cookbook/]

import argparse
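parser = argparse.ArgumentParser()
# action="store_true" (boolean flag) or action="store" (option that takes a value)
# default and type below are placeholders; set them to suit the option
parser.add_argument("-f", "--function", help="set output width", action="store", dest="function_name", default=None, type=str)
parser.add_argument('count', action="store", type=int)
args = parser.parse_args()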
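A minimal, self-contained sketch of both action types plus a positional argument. The option names, the default "main" and the help texts are assumptions made for the example; an explicit argument list is passed to parse_args() so the result does not depend on how the script is started:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="argparse sketch")
    # Option that takes a value, stored as args.function_name
    # (the default "main" is an assumption for this example).
    parser.add_argument("-f", "--function", dest="function_name",
                        action="store", default="main", type=str,
                        help="name of the function to run")
    # Flag: True when present, False otherwise.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="enable verbose output")
    # Required positional argument, converted to int.
    parser.add_argument("count", action="store", type=int)
    return parser


# Pass an explicit list instead of reading sys.argv so the example is repeatable.
args = build_parser().parse_args(["-f", "report", "--verbose", "3"])
print(args.function_name, args.verbose, args.count)
```

In a real script, call parse_args() without arguments to read the command line.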

Use pyenv to work with python

The easiest way to work with different Python versions is to use pyenv and pyenv-virtualenv. First, both pyenv and the pyenv-virtualenv plugin need to be installed.

Install pyenv

[https://github.com/pyenv/pyenv]

Install pyenv-virtualenv

Note: depending on how pyenv was installed (for example with the pyenv-installer script), pyenv-virtualenv may already be installed. If you have installed pyenv, check whether the plugin is already present at $(pyenv root)/plugins/pyenv-virtualenv

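Add the following line to your shell startup file (the command below appends it to ~/.bash_profile; use ~/.bashrc instead if that is the file your shell reads)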


echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bash_profile

[https://github.com/pyenv/pyenv-virtualenv]

Alternatively, see https://realpython.com/intro-to-pyenv/ for installation instructions

Common build errors

https://stackoverflow.com/questions/52873193/error-the-python-ssl-extension-was-not-compiled-missing-the-openssl-lib-inst

On Debian stretch (and Ubuntu bionic), libssl-dev is OpenSSL 1.1.x, 
but support for that was only added in Python 2.7.13, 3.5.3 and 3.6.0. To install earlier versions, 
you need to replace libssl-dev with libssl1.0-dev. This is being tracked in https://github.com/pyenv/pyenv/issues/945.

So if you don't need a specific older release of 2.7, you can install 2.7.13 or later and the error will not appear
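After a build finishes, a quick way to confirm that the ssl extension was compiled is to import it from the new interpreter. A minimal sketch; ssl_build_info is a hypothetical helper name:

```python
import sys


def ssl_build_info():
    """Return the OpenSSL version string the ssl module was built against,
    or None when the ssl extension is missing from this interpreter."""
    try:
        import ssl
    except ImportError:
        return None
    return ssl.OPENSSL_VERSION


python_version = "{}.{}.{}".format(*sys.version_info[:3])
openssl_version = ssl_build_info()

print("Python", python_version)
print("OpenSSL:", openssl_version if openssl_version else "ssl extension not available")
```

Run it with the interpreter pyenv just built (for example via pyenv which python) rather than the system one.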

FAQ Commands

pyenv install --list | grep " 3\.[678]"
pyenv install --list | grep "jython"
pyenv install -v 3.7.2

# ls ~/.pyenv/versions/
pyenv uninstall 2.7.15

pyenv versions

which python
pyenv which python

# If, for example, you wanted to use version 2.7.15, then you can use the global command
pyenv global 2.7.15

# If you ever want to go back to the system version of Python as the default, you can run this:
pyenv global system

pyenv commands

# The local command is often used to set an application-specific Python version
pyenv local 2.7.15

# The shell command is used to set a shell-specific Python version
pyenv shell 3.8-dev

# Create a virtualenv: pyenv virtualenv <python_version> <environment_name>
pyenv virtualenv 3.6.8 myproject

# Use the virtualenv in the current directory
pyenv local myproject

# Deactivate the active virtualenv
pyenv deactivate
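pyenv local records its choice in a .python-version file in the current directory, and pyenv looks for that file in the current directory and then in its parents. A rough sketch of that lookup, for checking which file applies to a given directory; find_local_version is a hypothetical helper, not a pyenv API, and the demo runs in a throwaway directory:

```python
import tempfile
from pathlib import Path


def find_local_version(start):
    """Return (path, contents) of the nearest .python-version file at or
    above start, or None when no such file exists."""
    start = Path(start).resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".python-version"
        if candidate.is_file():
            return candidate, candidate.read_text().strip()
    return None


# Demonstration in a temporary directory tree.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    nested = project / "src" / "pkg"
    nested.mkdir(parents=True)
    # Stand-in for what "pyenv local myproject" would write.
    (project / ".python-version").write_text("myproject\n")
    found = find_local_version(nested)
    found_name = found[1] if found else None

print(found_name)
```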

Running the script via crontab using pyenv

[http://blog.rubypdf.com/2018/10/12/how-to-set-virtualenv-for-a-crontab/]

cat << 'EOF' >> /root/docker_data/scripts/nsedatadownloader/runpy
#!/bin/bash
source /root/.pyenv/versions/3.5.4/envs/3_5_4/bin/activate

# virtualenv is now active, which means your PATH has been modified.
# Don't try to run python from /usr/bin/python, just run "python" and
# let the PATH figure out which version to run (based on what your
# virtualenv has configured).

python "$@"

# another way
# echo 'source /Users/steven/.pyenv/versions/3.6.4/envs/ts/bin/activate; python /Users/steven/tmp/hello.py' | /bin/bash
EOF
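To confirm that a cron job really runs inside the virtualenv, the script it starts can log which interpreter is in use. A minimal sketch; in_virtualenv is a hypothetical helper name:

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv sets sys.base_prefix to the parent installation; the older
    # virtualenv tool set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


print("executable:", sys.executable)
print("virtualenv active:", in_virtualenv())
```

Redirect the cron job's output to a log file to see these two lines.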

Pandas

[https://honingds.com/blog/pandas-read_csv/]

Frequently used pandas commands

import pandas as pd
filepath = 'test.csv'
pd_data = pd.read_csv(filepath)
list(pd_data.columns)
# read only the listed columns
df = pd.read_csv(filepath, usecols=["SYMBOL \n", "PREV. CLOSE \n"])
# read every column except the listed ones
df = pd.read_csv("f500.csv", usecols=lambda column: column not in ["company", "rank", "revenues"])
# skip the first 50 lines of the file and read 5 rows
pd.read_csv(filepath, skiprows=50, nrows=5)
# set the data type for a CSV column; use thousands=',' if the values contain a thousands separator
df = pd.read_csv(filepath, dtype={"PREV. CLOSE \n": "float64"}, thousands=',')
df.info()
# https://www.interviewqs.com/ddi_code_snippets/get_list_pandas_dataframe
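A self-contained sketch of the options above. The CSV text is made-up sample data held in memory (io.StringIO) so no file is needed, and the column names are assumptions. Note that skiprows=50 skips the first 50 lines of the file, header included, while a list such as skiprows=[1] skips only the given line numbers:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for test.csv.
csv_text = (
    "SYMBOL,PREV. CLOSE,LATEST,VOLUME\n"
    'AAA,"1,250.50","1,260.00",100\n'
    "BBB,99.25,98.75,200\n"
    "CCC,10.00,10.50,300\n"
)

# Keep two columns and parse "1,250.50" as the float 1250.5.
df = pd.read_csv(io.StringIO(csv_text),
                 usecols=["SYMBOL", "PREV. CLOSE"], thousands=",")

# Skip the first data row (line 0 is the header) and read a single row.
one_row = pd.read_csv(io.StringIO(csv_text), skiprows=[1], nrows=1)

print(list(df.columns))
print(df.dtypes)
print(one_row)
```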

Looping through csv and fetching values

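# input_file_pd_data_n_rows and path_in_str are assumed to be defined earlier
for i, row in input_file_pd_data_n_rows.iterrows():
    symbol = row['SYMBOL']
    analyse_file_pd_data = pd.read_csv(str(path_in_str))
    # select the rows whose SYMBOL matches
    analyse_file_row = analyse_file_pd_data.loc[analyse_file_pd_data['SYMBOL'] == symbol]
    analyse_file_ltp = analyse_file_row['LATEST']
    # first matching value (raises IndexError if there is no match)
    analyse_file_ltp_value = analyse_file_ltp.iloc[0]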
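A runnable sketch of the same lookup, with two small in-memory DataFrames standing in for the CSV files (the data is made up). It checks for an empty selection before calling iloc[0], which would otherwise raise IndexError when a symbol is missing:

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files.
input_df = pd.DataFrame({"SYMBOL": ["AAA", "BBB", "ZZZ"]})
analyse_df = pd.DataFrame({
    "SYMBOL": ["BBB", "AAA", "CCC"],
    "LATEST": [98.75, 1260.0, 10.5],
})

latest_by_symbol = {}
for _, row in input_df.iterrows():
    symbol = row["SYMBOL"]
    # Select the LATEST values of the rows whose SYMBOL matches.
    match = analyse_df.loc[analyse_df["SYMBOL"] == symbol, "LATEST"]
    if not match.empty:
        latest_by_symbol[symbol] = float(match.iloc[0])

print(latest_by_symbol)
```

Symbols without a match ("ZZZ" here) are simply left out of the result.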

Create dataframe and print in tabular format

# val1..val4 are placeholders for your own values
data = {'Col1': [val1, val2],
        'Col2': [val3, val4]}
# columns= sets the column order
df = pd.DataFrame(data, columns=['Col2', 'Col1'])
print(df)
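The same thing with concrete values (made up for the example), plus to_string(index=False) for printing the table without the index column:

```python
import pandas as pd

data = {"Col1": [1, 2], "Col2": ["a", "b"]}
# columns= controls the order in which the columns appear.
df = pd.DataFrame(data, columns=["Col2", "Col1"])

print(df)                         # default rendering, with the index column
table_text = df.to_string(index=False)
print(table_text)                 # same table without the index
```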

Just a few tips

sort files based on timestamp

import glob
import os
files = list(filter(os.path.isfile, glob.glob(os.path.join(args.dir_path, args.file_prefix + '*.csv'))))
files.sort(key=os.path.getmtime)
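A self-contained sketch of the same sort that does not need argparse. It creates three files in a temporary directory and sets their modification times explicitly with os.utime, so the order is predictable; the file names and the "data_" prefix are made up for the example:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as dir_path:
    # Create three files and give them explicit modification times.
    for name, mtime in [("data_b.csv", 300), ("data_a.csv", 100), ("data_c.csv", 200)]:
        path = os.path.join(dir_path, name)
        with open(path, "w") as handle:
            handle.write("x\n")
        os.utime(path, (mtime, mtime))  # (access time, modification time)

    files = [p for p in glob.glob(os.path.join(dir_path, "data_*.csv"))
             if os.path.isfile(p)]
    files.sort(key=os.path.getmtime)  # oldest first
    sorted_names = [os.path.basename(p) for p in files]

print(sorted_names)  # ['data_a.csv', 'data_c.csv', 'data_b.csv']
```

Add reverse=True to the sort call to get the newest file first.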