10/15/2016

Bug in (datetime|time).strptime - AttributeError: _strptime

 Issue 7980


This bug occurs only when you use threads, and only once (on the first use).

Of course, this method is thread-safe, but there is a severe caveat: the first use of strptime is not thread-safe (under the hood it imports _strptime, and that import may raise AttributeError). To avoid this problem, call strptime or import _strptime before starting any threads.

How to fix it?

Just import _strptime or call strptime before starting a thread.
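A minimal sketch of the workaround; the explicit import in the main thread happens before any worker thread touches strptime (the format string here is just an example):

```python
import _strptime  # noqa: F401 -- forces the lazy import before threads start
import threading
from datetime import datetime

results = []

def parse():
    # safe now, because _strptime is already imported in the main thread
    results.append(datetime.strptime("10/15/2016", "%m/%d/%Y"))

threads = [threading.Thread(target=parse) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the first import, one of these threads could occasionally fail with AttributeError: _strptime on the very first call.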

It seems this bug has been around since 2010.

7/31/2016

Py.test - Splitting conftest file

If you have to maintain a massive project, you probably have many fixtures in your conftest file. And there is a problem: this file grows and grows. So, at some point, you decide to split this huge file into smaller files. But py.test has to know where these fixtures are kept.

So, what can you do? I see three patterns for this:
  • import these guys inside conftest.py file
  • create more conftest.py files
  • use pytest_plugins

Import fixtures inside conftest file


You can for example do this:
from tests.my_fixtures.fixs1 import (
    fix11, fix12, fix13, fix14, fix15, ...
)
from tests.my_fixtures.fixs2 import (
    fix21, fix22, fix23,
)
from tests.my_fixtures.fixs3 import (
    fix31, fix32, fix33
)

But there are some problems with this approach. You import the fixtures explicitly, so whenever you create a new fixture, you have to remember to add it to the import clause. This strategy is not the best solution; if you have many files with fixtures, it will hurt you. Every developer will have to remember to append each new fixture here.

Hence, you can import all of them:
from tests.my_fixtures.fixs1 import *
from tests.my_fixtures.fixs2 import *
from tests.my_fixtures.fixs3 import *

But I don't think that is good practice. For example, you can encounter problems with pyflakes.

Create more conftest files


You could also create more conftest files. A straightforward solution is to separate tests into, for example, unit/integration/ui directories and keep a conftest file in each.

For example:
tests/ui/conftest.py
tests/unit/conftest.py
tests/integration/conftest.py

Ok, but what if these files are still large? You want to split them again... We recently discovered that we can use pytest_plugins for this.

Use pytest_plugins


How does py.test discover plugins? (Here's more info.)
  1. by loading all built-in plugins
  2. by loading all plugins registered through setuptools entry point
  3. by pre-scanning the command line for the -p name option
  4. by loading all conftest.py files
  5. by recursively loading all plugins specified by the pytest_plugins variable in conftest.py files
Look at the fifth item: you can load your fixtures via the pytest_plugins variable. You just have to add it to your conftest.py file.

Let's say you have many fixtures in your UI tests (in a conftest.py file). You split them into smaller modules, and now you want to have access to all of them during tests.

Here's an example:
pytest_plugins = [
    ...
    'tests.ui.fixtures.user_fixtures',
    'tests.ui.fixtures.admin_fixtures',
    'tests.ui.fixtures.template_fixtures',
    'tests.ui.fixtures.free_account_fixtures',
    ...
]

If you create a new module with new fixtures, you have to add it here, but I don't think that is a problem.
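For illustration, one of these modules (say a hypothetical tests/ui/fixtures/user_fixtures.py; the names are made up) could look like:

```python
import pytest

def make_user(name="alice"):
    # plain helper, so the data shape can also be reused outside fixtures
    return {"name": name, "is_admin": False}

@pytest.fixture
def user():
    return make_user()
```

Any test in the project can then take user as an argument, just as with fixtures defined directly in conftest.py.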

6/27/2016

DynamoDB in pytest-dbfixtures

pytest-dbfixtures

If you use pytest maybe you have heard about pytest-dbfixtures:
Pytest dbfixtures is a pytest plugin that makes it a lot easier to set up proper database or storage engine for testing. Simply use one of provided fixtures that start predefined clean database server for your tests or creates server more tailored for your application by using one of provided factories.
This plugin is very useful if you have integration tests in your project and want to run them against a real database, for example. You will find information on how to use it in the documentation.

Currently, the plugin supports:
  • Postgresql
  • MySQL
  • Redis
  • Mongo
  • Elasticsearch
  • RabbitMQ
And recently, we have added support for DynamoDB.

Here you will find instructions on how to run DynamoDB on your computer. And here is the point: if you want to use DynamoDB in production, you want to test it locally.

dynamodb fixture

If you do not use pytest yet, go to the pytest page and read how to use it, now. To understand the rest of this post, you have to know how pytest works.

The simplest way is to put dynamodb fixture as an argument in your test:
def test_dynamodb(dynamodb):
    ...

You will find an instance of the DynamoDB.ServiceResource class in the dynamodb variable. The plugin assumes you have extracted the DynamoDB files into /tmp/dynamodb. If you have not extracted the files into this directory, or you do not want to do this, you can create your own fixture:
from pytest_dbfixtures import factories

my_dynamodb_proc = factories.dynamodb_proc(
    dynamodb_dir="/path/to/mydynamodb",
    port=14235,
    delay=True,
)
my_dynamodb = factories.dynamodb("my_dynamodb_proc")


def test_my_dynamodb(my_dynamodb):
    ...

As you can see, you may also pass port and delay as arguments. If you do not specify the port argument, the fixture will use a random port. If you want to introduce delays for certain operations in DynamoDB, pass delay=True.
DynamoDB can perform some tasks almost instantaneously, such as create/update/delete operations on tables and indexes; however, the actual DynamoDB service requires more time for these tasks. Setting this parameter helps DynamoDB simulate the behavior of the Amazon DynamoDB web service more closely. (Currently, this parameter introduces delays only for global secondary indexes that are in either CREATING or DELETING status.)
So, let's say that you want to run DynamoDB on port 9009 with some delays, and you keep the DynamoDB files in /tmp/tests/dynamodb. Then you want to create a table and put some data there.

Here is an example:
from pytest_dbfixtures import factories

my_dynamodb_proc = factories.dynamodb_proc(
    dynamodb_dir="/tmp/tests/dynamodb",
    port=9009,
    delay=True,
)
my_dynamodb = factories.dynamodb("my_dynamodb_proc")


def test_my_dynamodb(my_dynamodb):
    table = my_dynamodb.create_table(
        TableName="Test",
        KeySchema=[
            {
                "AttributeName": "id",
                "KeyType": "HASH"
            }
        ],
        AttributeDefinitions=[
            {
                "AttributeName": "id",
                "AttributeType": "N"
            }
        ],
        ProvisionedThroughput={
            "ReadCapacityUnits": 10,
            "WriteCapacityUnits": 10
        }
    )

    table.put_item(
        Item={
            "id": 42,
            "my_data": "is secret"
        }
    )

    response = table.get_item(
        Key={
            "id": 42,
        }
    )

    assert response["Item"]["my_data"] == "is secret"

Because we want tests to run faster and faster, we run DynamoDB in memory. Remember that when you stop the DynamoDB process, none of the data will be saved (DynamoDB does not write any database files at all). If you do not want to run tests in memory, your Pull Request is welcome.

6/07/2016

PyCharm - tips, tricks and plugins

Introduction

I have been using Vim for a long time, but recently I have started using PyCharm too. This tool has many nice features and plugins; I will show you some of them. You will find many gifs here instead of text, so you can easily see how it works, how it looks, etc.

Plugins


 .ignore
This plugin was written by my friend. Here is more info about it. It will generate new ignore files for you.



BashSupport
Supports syntax highlighting, rename refactoring, documentation lookup, inspections, quickfixes, etc.


Dash
A smart and simple plugin that provides keyboard shortcut access for Dash.


KeyPromoter
Shows you how easily you can perform the same action using only the keyboard.


Lua
Support for Lua.


Markdown
Support for markdown.


YAML/Ansible support
I don't want to write again "Support for YAML/Ansible".

Tips & Tricks


You do not know what to do? Do not be scared! Just use TODO.


You have easy access to the Terminal and the Python Console. Of course, you can decide which shell you want to use.



Version Control. Let's say you want to see what someone has changed in a file between commit A and commit B. You can just browse the git history.


And now, you want to see all commits of a particular user.


You have reached a very odd file. You want to see who wrote it, when, and why.


You have a better idea for a variable name, so you want to change it quickly.


If you create a docstring, PyCharm will fill it with the parameters.
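For example, for a function like the hypothetical one below, PyCharm generates the :param/:return skeleton (reST style by default) and you only fill in the descriptions:

```python
def transfer(amount, source, target):
    """Move funds between accounts.

    :param amount: how much to move
    :param source: account to withdraw from
    :param target: account to deposit to
    :return: None
    """
```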


Let's say that you have changed some lines of code and forgot what was there before. Consequently, you want to see these changes and restore the former state if there was a mistake.


Your project is large and you do not want to remember where every piece of code lives. Too much waste of time. Hence, you want to find these objects very quickly. You can locate them by class name, file name, symbol name, etc.


You have class A, and other classes inherit from it. Class A has a particular method, and you want to see how this method is implemented in the other classes.


That's enough for me. I am not going to convince anyone to use PyCharm; I just wanted to show some features that I like very much in this editor. It is the best alternative to Vim (for me, of course).

6/02/2016

AWS Lambda and Python

What is the AWS Lambda?

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume - there is no charge when your code is not running.
So, you can build a scalable application without managing servers. Something like a microservice without servers. Serverless!

Hence, your function must be written in a stateless style. If you want to store something somewhere, you can connect to S3, Redshift, DynamoDB, etc. AWS Lambda will start to execute your code within milliseconds.

I am not going to write a step-by-step tutorial. You will find here only the essential information.

Limits

Let's look at the limits (all limits are default, you can ask guys from AWS to increase them).
  • You have access to an ephemeral disk with a 512 MB limit (access to /tmp only).
  • Your function must finish within 300 seconds (the default maximum; you can also set a lower value, e.g. 10 seconds); if not, AWS Lambda will terminate it.
  • The zip package with your function must be less than 50 MB.
  • The uncompressed package must be less than 250 MB.
  • Memory: you can choose from 128 MB to 1536 MB (the more memory you choose, the more you pay; pricing details).
Tip: remember to remove unused files from /tmp. AWS Lambda can spawn functions in "the same box" (if you call many of them in a short period), so you do not want to get the error: "Sorry. Not enough space".
The Lambda free tier does not automatically expire at the end of your 12 month AWS Free Tier term, but is available to both existing and new AWS customers indefinitely.
What does that mean? It means that you have 1M free requests per month and 400,000 GB-seconds of compute time per month. It will reduce your costs a lot.

The flow

  1. Upload your code to AWS Lambda
  2. Set up your code (when to trigger your code, etc.)
  3. Something triggers your code
  4. You pay only for the compute time you use 

Asynchronous and Synchronous

AWS Lambda can work in two modes. In asynchronous mode, AWS Lambda is triggered by events in AWS services and the caller does not wait. In synchronous mode, you trigger the code and wait for the result.

Python

Which versions of Python are supported?
Lambda provides a Python 2.7-compatible runtime to execute your Lambda functions. Lambda will include the latest AWS SDK for Python (boto3) by default.
Can I use packages with AWS Lambda in Python?
Yes. You can use pip to install any python packages needed.

Programming Model

Handler - you have to specify a handler function. When something invokes your Lambda function, AWS Lambda executes your code by calling the handler function.

def handler_function(event, context):
    # event - some event data (usually a dict)
    # context - runtime information
    return 'ok'

The Context Object - here you will find information such as the function name, version, memory limit, etc.
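For illustration, a handler that reads a few of these attributes (attribute names as in the AWS Lambda Python docs) might look like:

```python
def handler_function(event, context):
    # a few of the runtime attributes exposed by the context object
    return {
        "function": context.function_name,
        "memory": context.memory_limit_in_mb,
        "time_left_ms": context.get_remaining_time_in_millis(),
    }
```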

Logging - you can use print or logging. Both variants produce logs and put them into CloudWatch, but with logging you get more information, like the timestamp and log level. For example, you can turn on DEBUG on dev, but WARN on production.
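A minimal sketch (the level and message here are only examples):

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)  # e.g. DEBUG on dev, WARN on production

def handler_function(event, context):
    # unlike print, this entry lands in CloudWatch with a timestamp and level
    logger.info("processing event: %s", event)
    return "ok"
```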

Exceptions - if your code raises an exception, AWS Lambda will notify you about it. In synchronous mode, AWS Lambda returns JSON with information about the exception. In asynchronous mode, you will find the exception in the logs.
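A sketch of what that looks like; the JSON shape in the comment is roughly what a synchronous caller receives (field names per the AWS docs; exact content varies by runtime):

```python
def handler_function(event, context):
    # any uncaught exception is reported back by AWS Lambda; a synchronous
    # caller gets JSON roughly like:
    # {"errorMessage": "boom", "errorType": "ValueError", "stackTrace": [...]}
    raise ValueError("boom")
```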

Versioning and Aliases

You do not want to put your code directly on production (I merely assume), so you need versioning. It works very smoothly:

Versioning and Aliases

Versions are immutable (you cannot change them), but aliases are mutable. You can think of an alias as a pointer to a particular version. So, your PROD alias can point at version 1 (stable), while the DEV alias points at version 5 (in progress). Remember, you cannot change a version; you have to create a new one. But you can change an alias.

Upload code to AWS Lambda

Here is a list of packages which make it easier to deploy/update your function:
I have been using only one of them: lambda-uploader. I have not had any problems with this package.

Asynchronous mode

Let's say that you have an S3 bucket where clients can put files, and you want to perform some operations after someone uploads a file. You can use AWS Lambda for this. You have to implement the logic in your package and upload it to AWS Lambda. After that, you should create an alias and add an event source for this function. You want to call your function when someone uploads a new file to the bucket, so it might look like:

Add event source
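The handler for such an upload might look like this sketch; the nesting follows the S3 event notification format:

```python
def handler_function(event, context):
    # S3 put notifications arrive as a list under "Records"
    uploaded = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        uploaded.append((bucket, key))
    return uploaded
```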

Next, if someone puts a file in the bucket, AWS Lambda will invoke the function and your code will do something! In the monitoring tab you can see stats from the last 24 hours (in CloudWatch you have more data, of course):

Monitoring

You can create alarms for errors, so if something bad happens, you can intervene.

If you want, you can also use AWS Lambda as a cron. You have to set this in the event source:


Synchronous mode

And now you want to call a Lambda function and get a result. Accordingly, you can use a synchronous invoke. In this way, you will receive a response when Lambda completes executing (or you will receive JSON with an error). You can use API Gateway and Lambda for this. Here is a framework in BETA stage (2016/05/31) if you are interested.

Here you will find instructions on how to create an API. If you need a tutorial (API Gateway + Lambda), I have found two interesting ones:
You can also use events from Kinesis, CloudWatch, DynamoDB and SNS to call AWS Lambda.


AWS Lambda is an excellent strategy if you want to perform a specific action from time to time. You have to remember that Lambda is stateless; therefore, you have to keep information about state somewhere else (if you need it). You do not have to pay for a machine which has to run all month. Your code runs? You pay.
