Resurrecting Skype4Py

Posted on 2013-01-30 by Mikko Ohtamaa

Today I had the pleasure of making my first Skype4Py release as a maintainer. The power of open source: code will never die.

Skype4Py is a Python library for controlling the Skype desktop application, enabling all sorts of automation for Skype chats and calls. It is originally the work of Arkadiusz Wahlig, from 2009.

The code is old and has not seen active development for years, but it is in perfectly good shape and follows the best practices of Python development. The code just works, regardless of the many Skype releases the world has seen over the last decade. The library was missing only a few patches for changes in modern operating systems and for minor bugs.

Also, this project is 100% free, unlike Skype’s own SDK (registration + 5 USD fee needed last time I checked).

What was actually missing was a community-maintained release process. So Arkadiusz moved the code to GitHub, where I could migrate the patches from earlier code forks. Then I updated the codebase to follow zest.releaser release automation best practices, so making a new release is now just a matter of typing a single command. GitHub also makes merging patches much easier thanks to its pull request feature.

I did this mainly because my own project, Sevabot, a Skype bot supporting integration with external services, depends on Skype4Py, and having real PyPI package releases of Skype4Py makes the life of Sevabot users easier. In turn, I use Sevabot internally in my projects (issue tracker, version control monitoring integration), so it basically makes my own life a joy. I have outsourced my work to a Skype bot, even better than outsourcing it to low paid Asian developers.

There are still some outstanding (Windows) issues in Skype4Py. I trust that now, when people see active development around the project, there will be enough Microsoft-oriented hands and eyeballs to get those issues fixed.


Script for generating Google documents from Google spreadsheet data source

Posted on 2013-01-21 by Mikko Ohtamaa

Table of Contents

1. Ingredients of the document generator

2. Short introduction to Google Apps Script

3. The generator script

4. Translations

Microsoft Office supports “data sources” to generate e.g. letters, invoices, address stickers and other repeating documents based on a Microsoft Word template and Microsoft Excel data. This is a very common small business problem, and Office has had a solution for it since the mid-90s. Google Apps, the cloud-based alternative to Microsoft Office, does not offer similar functionality natively (or at least if it does, it is hidden really well). However, you can quite easily create your own document generator using Google Apps scripting, as long as you are proficient in programming. In this blog post I’ll show an example of how to create such a script, and you will learn the basics of Google Apps Script.

1. Ingredients of the document generator

We have the following inputs for our business problem:

A Google Apps spreadsheet which contains customer data.
A Google Apps docs template document. Based on it, we want to generate a document for each customer by filling this template document with the data from the spreadsheet.
A Google Drive folder where the resulting documents are stored.
A Google Apps script which automates the task for us (JavaScript-based)

In my case the use case was generating contract texts for the customers based on their price and quality of service data. Then I just exported and emailed the resulting Google Docs as PDF.

All of these are stored on your Google Apps account in Google Drive. All editing happens through the Google Apps user interface; no external tools are needed.

Example of source data (obfuscated with obfuscate.js)

Example of the template document (obfuscated with obfuscate.js). You can see the source labels, unfilled.

Example of the resulting document – labels filled in and no longer in bold (obfuscated with obfuscate.js)

2. Short introduction to Google Apps Script

Google Apps Scripts can be invoked in several ways:

From the Google Drive user interface (Add new… more… Script)
Directly from a Google Apps spreadsheet (there is a menu entry)
And more

There is also an integrated public snippet sharing service in Google Spreadsheet, though the UI really leaves room for improvement here. You can also sell scripts in the Google Chrome Web Store.

Because we are not working inside the spreadsheet, we need to use the first approach.

Google Apps Script is a JavaScript-based (ECMAScript version unspecified? Does it run V8?) cloud scripting language that provides easy ways to automate tasks across Google products and third-party services. Google Apps Script has extensive API documentation with examples and tutorials, but the APIs are still very much subject to change: almost everything is marked as experimental, and there are already a lot of deprecated methods. Scripts can also access Google Maps, contacts, email, sites and Google Apps domain settings, and basically offer an automation solution for almost everything you can do in the Google cloud.

The script is executed on the server-side and you have a non-fancy localized browser based UI to edit and debug your script.

The philosophy and UI design patterns feel like a step back to the 90s, to the Visual Basic scripting environment. Maybe the Google Apps developers wanted this… so that Visual Basic developers feel at home. However, coming from a web development, JavaScript and general programming background, you will find the lack of a Firebug / Web Inspector like console disturbing. It does not feel like any other JavaScript development, though the syntax is certainly the same.

So my minor complaints include, but are not limited to:

Logging from the application is possible, but the log trace is very unreadable in the UI
The program does not have a specific entry point; you need to choose a function using a selection widget. This makes the script feel like a toy.
The debugger (and the lack of a console) does not seem to allow you to modify and dynamically poke objects at run-time (call functions, etc.)
The debugger is a bit slow (a round-trip to Google servers), though still pretty much usable
Lack of low-end user interaction tools in standalone scripting (please see below)
API documents and reality did not always match (as everything is still experimental)

Debugger in action

Things could be better, but in the end I managed to get done what I was looking for, and I am still not paying a penny for Google Apps, so I am happy. Also, I do not wish to go back to Microsoft Office unless I need to write well-formatted print documents… Google Docs is a toy when it comes to heavy and graphically sensitive document authoring, such as offers… or presentations… where Keynote is the king.

3. The generator script

At the beginning of the script you have constants which define which data to operate on. You could build a user interface and turn the script into a full web application, but that is too cumbersome an approach for such a small task. The UI builder seemed nice, but it is definitely overkill. Though Google Apps Script has API methods for asking a simple prompt() style question in the browser, for some reason they were not supported in standalone scripting… so the fastest way to enter data into the script was simply to edit the script itself before each run. I sooo started to miss the command line… for the first time in my life.

So, in the beginning of the script you define the source data

Spreadsheet id (you can pick it up from URL when you edit the document)
Template document id (you can pick it up from URL when you edit the document)
Customer id which is the spreadsheet row number, for the current script run
The Google Drive folder id where the resulting document will be placed for sharing. Again, you can pick the id from the URL when opening the folder.
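
Instead of copying the ids by hand, they can be pulled out of a pasted URL. The following is a minimal sketch in plain JavaScript. The helper name extractDriveId and the URL shapes it recognises (/d/<id>/ for documents and spreadsheets, /folders/<id> for folders, and a key= query parameter in older spreadsheet links) are assumptions about how Google formats its links, not part of the original script, and the example URLs are made up.

```javascript
// Hypothetical helper: pull a Google Drive id out of a pasted URL.
// The recognised URL shapes are assumptions and may not cover every link format.
function extractDriveId(url) {
  var patterns = [
    /\/d\/([A-Za-z0-9_-]+)/,        // .../document/d/<id>/edit, .../spreadsheets/d/<id>/edit
    /\/folders\/([A-Za-z0-9_-]+)/,  // .../drive/folders/<id>
    /[?&]key=([A-Za-z0-9_-]+)/      // older spreadsheet links: ...?key=<id>
  ];

  for (var i = 0; i < patterns.length; i++) {
    var match = patterns[i].exec(url);
    if (match) {
      return match[1];
    }
  }
  return null; // Nothing that looks like an id was found
}

// Made-up example URLs
var templateId = extractDriveId("https://docs.google.com/document/d/abc123_DEF/edit");
var folderId = extractDriveId("https://drive.google.com/drive/folders/xyz-789");
```

If a link does not match any of the assumed shapes the helper returns null, so it is easy to notice when an id still has to be picked from the URL manually.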

Then the script simply replaces words with data. The keywords to be replaced in the template document are the column labels (1st row) of the spreadsheet data. I am pretty sure there are more efficient ways to do this, but I did not wish to spend time going knee-deep into GS to figure out its nuances.
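
The replacement logic can be tried outside Google Apps. Below is a minimal sketch in plain JavaScript that works on an array of strings standing in for the document paragraphs. The function name fillTemplate and the sample labels and values are made up for illustration; the real script operates on Google Docs paragraph objects instead.

```javascript
// Sketch: fill "Label:" paragraphs with values, keeping the label in place.
// Paragraphs are plain strings here; labels and values are hypothetical sample data.
function fillTemplate(paragraphs, labels, values) {
  return paragraphs.map(function (text) {
    for (var i = 0; i < labels.length; i++) {
      var key = labels[i] + ":";
      if (text.indexOf(key) >= 0) {
        // Replace the whole paragraph with "Label: value"
        return key + " " + (values[i] || "");
      }
    }
    return text; // No label found, leave the paragraph untouched
  });
}

var labels = ["Customer", "Price", "Uptime"];
var values = ["Example Ltd", "100 EUR", "99.9%"];
var template = ["Service agreement", "Customer:", "Price:", "Uptime:"];

var filled = fillTemplate(template, labels, values);
// filled is ["Service agreement", "Customer: Example Ltd", "Price: 100 EUR", "Uptime: 99.9%"]
```

Because matching is done by substring, a label such as Price would also match a paragraph containing "Unit Price:", so the column labels are best chosen so that none of them is the tail of another.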

And then the script… please feel free to modify to your own needs (generator.gs):

/**
 * Generate Google Docs based on a template document and data incoming from a Google Spreadsheet
 *
 * License: MIT
 *
 * Copyright 2013 Mikko Ohtamaa, http://opensourcehacker.com
 */

// Row number from where to fill in the data.
// Rows are 1-based and row 1 holds the column labels,
// so the first customer is on row 2.
var CUSTOMER_ID = 2;

// Google Doc id from the document template
// (Get ids from the URL)
var SOURCE_TEMPLATE = "xxx";

// In which spreadsheet we have all the customer data
var CUSTOMER_SPREADSHEET = "yyy";

// In which Google Drive folder we toss the target documents
var TARGET_FOLDER = "zzz";

/**
 * Return spreadsheet row content as JS array.
 *
 * Note: We assume the row ends when we encounter
 * the first empty cell. This might not always be
 * the desired behavior.
 *
 * Rows start at 1, not zero based!!! 🙁
 *
 */
function getRowAsArray(sheet, row) {
  // Read at most 99 columns of the given row
  var dataRange = sheet.getRange(row, 1, 1, 99);
  var data = dataRange.getValues();
  var columns = [];

  // getValues() returns an array of rows, each row an array of cells
  for (var i = 0; i < data.length; i++) {
    var cells = data[i];

    Logger.log("Got row: %s", cells);

    for (var l = 0; l < cells.length; l++) {
        var col = cells[l];
        // First empty column interrupts
        if(!col) {
            break;
        }

        columns.push(col);
    }
  }

  return columns;
}

/**
 * Duplicates a Google Apps doc
 *
 * Note: uses DriveApp, because the DocsList service used in the
 * 2013 version of this script has since been retired by Google.
 *
 * @return a new document with a given name from the original
 */
function createDuplicateDocument(sourceId, name) {
    var source = DriveApp.getFileById(sourceId);
    var targetFolder = DriveApp.getFolderById(TARGET_FOLDER);

    // Copy the template straight into the target folder
    var newFile = source.makeCopy(name, targetFolder);

    return DocumentApp.openById(newFile.getId());
}

/**
 * Search a paragraph in the document and replaces it with the generated text 
 */
function replaceParagraph(doc, keyword, newText) {
  var ps = doc.getParagraphs();
  for(var i=0; i<ps.length; i++) {
    var p = ps[i];
    var text = p.getText();

    if(text.indexOf(keyword) >= 0) {
      p.setText(newText);
      p.setBold(false);
    }
  } 
}

/**
 * Script entry point
 */
function generateCustomerContract() {

  var data = SpreadsheetApp.openById(CUSTOMER_SPREADSHEET);

  // XXX: Cannot be accessed when run in the script editor?
  // WHYYYYYYYYY? Asking one number, too complex?
  //var CUSTOMER_ID = Browser.inputBox("Enter customer number in the spreadsheet", Browser.Buttons.OK_CANCEL);
  if(!CUSTOMER_ID) {
      return; 
  }

  // Fetch variable names
  // they are column names in the spreadsheet
  var sheet = data.getSheets()[0];
  var columns = getRowAsArray(sheet, 1);

  Logger.log("Processing columns:" + columns);

  var customerData = getRowAsArray(sheet, CUSTOMER_ID);  
  Logger.log("Processing data:" + customerData);

  // Assume first column holds the name of the customer
  var customerName = customerData[0];

  var target = createDuplicateDocument(SOURCE_TEMPLATE, customerName + " agreement");

  Logger.log("Created new document:" + target.getId());

  for(var i=0; i<columns.length; i++) {
      var key = columns[i] + ":"; 
      // We don't replace the whole text, but leave the template text as a label
      var text = customerData[i] || ""; // No Javascript undefined
      var value = key + " " + text;
      replaceParagraph(target, key, value);
  }

}
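
One caveat in getRowAsArray deserves a closer look: the !col check ends the row not only at an empty cell but at any falsy value, such as the number 0. The sketch below reproduces this with a plain array standing in for one row returned by getValues(). The function names and the sample row are made up, and the stricter variant is just one possible alternative, not something the script above does.

```javascript
// Same stop condition as in the script: any falsy value ends the row
function readUntilFalsy(cells) {
  var columns = [];
  for (var i = 0; i < cells.length; i++) {
    if (!cells[i]) {
      break;
    }
    columns.push(cells[i]);
  }
  return columns;
}

// Alternative: only a truly empty cell (an empty string) ends the row
function readUntilEmpty(cells) {
  var columns = [];
  for (var i = 0; i < cells.length; i++) {
    if (cells[i] === "") {
      break;
    }
    columns.push(cells[i]);
  }
  return columns;
}

// Hypothetical customer row: name, price, a discount of zero, contact, then empty cells
var row = ["Example Ltd", 100, 0, "info@example.com", "", ""];

var loose = readUntilFalsy(row);   // ["Example Ltd", 100]
var strict = readUntilEmpty(row);  // ["Example Ltd", 100, 0, "info@example.com"]
```

With the loose check the zero discount silently cuts the row short, so every label after that column would be filled with empty text in the generated document.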

4. Translations

Please see Czech translation by Alex Bojik from Bizow.com.

Please see the Polish translation by Valeria Aleksandrova.

Obfuscate.js: clean sensitive information out from web page screenshots

Posted on 2013-01-15 by Mikko Ohtamaa

Obfuscate.js is a JavaScript console tool which obfuscates the text on a web page, hiding sensitive information. The purpose of this tool is to make shareable screenshots for collaborative debugging, examples, demos and so on. You no longer need to manually blur parts of the screenshot in Photoshop, GIMP or another image manipulation program, which saves your precious time and makes sharing screenshots easier.

You invoke Obfuscate.js script from the Javascript console of your web browser
Obfuscate.js obfuscates the whole page or parts of it
Take a screenshot
Post to a public forum, blog or other media

I hacked it together in a day for my own purposes.

1. Demonstration

Obfuscate.js can walk through the whole page (<body>) or arbitrary page bits chosen by CSS selectors.

Open the page with sensitive information in your web browser.

Open the JavaScript console. Copy-paste the line of text from obfuscate.min.js into the console.

At this point you still have the original page with the sensitive information

After this you can obfuscate a part of a page by writing the command in the console targeting a specific CSS selector:

obfuscate("#waffle-grid-container"); // Obfuscate contents of Google Spreadsheet

Or simply obfuscate all the text on the whole page:

obfuscate(); // Obfuscate all the text on the page
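
To give an idea of what obfuscating text while keeping the page layout recognisable can look like, here is a minimal sketch in plain JavaScript. It is not the code of Obfuscate.js: the function names randomChar, scrambleText and scrambleElement are hypothetical, and replacing letters and digits with random ones of the same kind is an assumed approach chosen for illustration.

```javascript
// Hypothetical sketch, not the Obfuscate.js implementation.
// Pick one random character from the given alphabet.
function randomChar(alphabet) {
  return alphabet.charAt(Math.floor(Math.random() * alphabet.length));
}

// Replace every ASCII letter and digit with a random one of the same kind.
// Whitespace, punctuation and non-ASCII characters are left alone,
// so the text keeps its length and shape.
function scrambleText(text) {
  return text.replace(/[a-z]/g, function () {
    return randomChar("abcdefghijklmnopqrstuvwxyz");
  }).replace(/[A-Z]/g, function () {
    return randomChar("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
  }).replace(/[0-9]/g, function () {
    return randomChar("0123456789");
  });
}

// Browser only: scramble all text nodes under an element.
// Defined but not called here, so the sketch also loads outside a browser.
function scrambleElement(root) {
  var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  var node;
  while ((node = walker.nextNode())) {
    node.nodeValue = scrambleText(node.nodeValue);
  }
}

var sample = "Invoice 2013-01: John Smith, 1 250 EUR";
var scrambled = scrambleText(sample);
// scrambled has the same length and punctuation, with randomised letters and digits
```

Keeping the character classes intact means that column widths and line breaks in the screenshot stay roughly the same as on the real page.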

2. More information

Go to the Obfuscate.js GitHub project page.

Exporting Plone content as JSON

Posted on 2013-01-04 by Mikko Ohtamaa

Below is a simple Python script for exporting data from Plone CMS as JSON.

Use cases

Migrating to other CMSs
Downloading data dumps to play with
Making content items available to be played around with in JavaScript

The script handles

Nested folders
Binary data
Can be executed from the command-line or from a web browser

Pros

Much simpler and more lightweight than anything involving the collective.transmogrify tool. Writing this script took around 1.5 hours.

The script is also on developer.plone.org. See the original blog post for syntax highlighting. Use with love and care or with booze and whiskey.

"""

    Export folder contents as JSON.

    Can be run as a browser view or command line script.

"""

import os
import sys
import base64

try:
    import json
except ImportError:
    # Python 2.4 / Plone 3.3 use simplejson
    # version > 2.3 < 3.0
    import simplejson as json

from Products.Five.browser import BrowserView
from Products.CMFCore.interfaces import IFolderish
from DateTime import DateTime

#: Private attributes we add to the export list
EXPORT_ATTRIBUTES = ["portal_type", "id"]

#: Do we dump out binary data... default we do, but can be controlled with env var
EXPORT_BINARY = os.getenv("EXPORT_BINARY", None)
if EXPORT_BINARY:
    EXPORT_BINARY = EXPORT_BINARY == "true"
else:
    EXPORT_BINARY = True

class ExportFolderAsJSON(BrowserView):
    """
    Exports the current context folder Archetypes as JSON.

    Returns downloadable JSON from the data.
    """

    def convert(self, value):
        """
        Convert value to more JSON friendly format.
        """
        if isinstance(value, DateTime):
            # Zope DateTime
            # http://pypi.python.org/pypi/DateTime/3.0.2
            return value.ISO8601()
        elif hasattr(value, "isBinary") and value.isBinary():

            if not EXPORT_BINARY:
                return None

            # Archetypes FileField and ImageField payloads
            # are binary as OFS.Image.File object
            data = getattr(value.data, "data", None)
            if not data:
                return None
            return base64.b64encode(data)
        else:
            # Passthrough
            return value

    def grabArchetypesData(self, obj):
        """
        Export Archetypes schema data as dictionary object.

        Binary fields are encoded as BASE64.
        """
        data = {}
        for field in obj.Schema().fields():
            name = field.getName()
            value = field.getRaw(obj)
            # No debug printing here: anything written to stdout would
            # end up inside the JSON dump when run from the command line
            data[name] = self.convert(value)
        return data

    def grabAttributes(self, obj):
        data = {}
        for key in EXPORT_ATTRIBUTES:
            data[key] = self.convert(getattr(obj, key, None))
        return data

    def export(self, folder, recursive=False):
        """
        Export content items.

        Possible to do recursively nesting into the children.

        :return: list of dictionaries
        """

        array = []
        for obj in folder.listFolderContents():
            data = self.grabArchetypesData(obj)
            data.update(self.grabAttributes(obj))

            if recursive:
                if IFolderish.providedBy(obj):
                    data["children"] = self.export(obj, True)

            array.append(data)

        return array

    def __call__(self):
        """
        """
        folder = self.context.aq_inner
        data = self.export(folder)
        # indent must be an integer for the Python 2 stdlib json module
        pretty = json.dumps(data, sort_keys=True, indent=4)
        self.request.response.setHeader("Content-type", "application/json")
        return pretty

def spoof_request(app):
    """
    http://developer.plone.org/misc/commandline.html
    """
    from AccessControl.SecurityManagement import newSecurityManager
    from AccessControl.SecurityManager import setSecurityPolicy
    from Products.CMFCore.tests.base.security import PermissiveSecurityPolicy, OmnipotentUser
    _policy = PermissiveSecurityPolicy()
    setSecurityPolicy(_policy)
    newSecurityManager(None, OmnipotentUser().__of__(app.acl_users))
    return app

def run_export_as_script(path):
    """ Command line helper function.

    Using from the command line::

        bin/instance run export.py yoursiteid/path/to/folder

    If you have a lot of binary data (images) you probably want::

        bin/instance run export.py yoursiteid/path/to/folder > yourdata.json

    ... to prevent your terminal being flooded with base64.

    Or just pure data, no binary::

        EXPORT_BINARY=false bin/instance run export.py yoursiteid/path/to/folder

    :param path: Full ZODB path to the folder
    """
    global app

    secure_aware_app = spoof_request(app)
    folder = secure_aware_app.unrestrictedTraverse(path)
    view = ExportFolderAsJSON(folder, None)
    data = view.export(folder, recursive=True)
    # Pretty pony is prettttyyyyy
    pretty = json.dumps(data, sort_keys=True, indent=4)
    print pretty

# Detect if run as a bin/instance run script
if "app" in globals():
    run_export_as_script(sys.argv[1])
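
On the receiving side, for instance when playing with the dump in JavaScript, the exported structure can be walked recursively. The sketch below assumes the shape produced by the script above: a list of objects carrying portal_type, id and, for folders exported recursively, a children list. The function name walkItems and the sample items (including their title field) are made up for illustration.

```javascript
// Visit every exported item, descending into "children" lists.
// The path argument collects the ids of the parent folders.
function walkItems(items, visit, path) {
  path = path || [];
  items.forEach(function (item) {
    var itemPath = path.concat([item.id]);
    visit(item, itemPath.join("/"));
    if (item.children) {
      walkItems(item.children, visit, itemPath);
    }
  });
}

// Hypothetical sample standing in for the parsed contents of yourdata.json
var exported = [
  { id: "news", portal_type: "Folder", title: "News", children: [
    { id: "hello", portal_type: "News Item", title: "Hello world" }
  ]},
  { id: "front-page", portal_type: "Document", title: "Welcome" }
];

var paths = [];
var countsByType = {};
walkItems(exported, function (item, itemPath) {
  paths.push(itemPath);
  countsByType[item.portal_type] = (countsByType[item.portal_type] || 0) + 1;
});
// paths is ["news", "news/hello", "front-page"]
```

Binary fields arrive as base64 strings; in Node.js they can be turned back into bytes with Buffer.from(value, "base64").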


Open Source Hacker

Pushing the boundaries of free technology
