Below is a sample script that automatically generates descriptions from page body text. It is written for the Plone CMS, but should be adaptable to any Python-based CMS with some modifications.
The idea is to take the first three sentences of the body and use them as the description.
Use case: People are often too lazy to write descriptions (descriptions as in Dublin Core metadata). You can generate some kind of description by taking the first few sentences of the text. This is not perfect, but it is much better than an empty description. Also, the script is heavily commented, which should be helpful for beginner Plone programmers.
Please comment if you have other simple ideas to generate descriptions.
Usage
- Add a Script (Python) item to any Plone folder through the Zope Management Interface
- Paste in the code below
- Hit the Test tab or type in the script URL manually. Note that the operation is one-shot only
- The script iterates through all content items in that folder
- The script writes logging output to the standard Plone log (var/log, and stdout if Plone is run in debug mode).
Since Zope uses RestrictedPython for scripts created through the web, the user of this script cannot breach server security (they cannot make Python calls they have no permission for). This imposes some limitations on automating tasks like this, but we don’t hit those limitations in our use case.
def create_automatic_description(content, text_field_name="text"):
    """ Creates an automatic description from the HTML body by taking the first three sentences.

    @param content: Any Plone contentish item (they all have a description)
    @param text_field_name: Which schema field supplies the body text (may vary depending on the content type)
    """

    # Body is the Archetypes "text" field in the schema by default.
    field = content.Schema()[text_field_name]

    # Returns a Python method which you can call to get the field's value
    # for a certain content type. This is also security aware
    # and does not breach field-level security provided by Archetypes
    accessor = field.getAccessor(content)

    # Accessor can take the desired format as a mimetype parameter.
    # The line below should trigger conversion from text/html -> text/plain
    # automatically using portal_transforms. body is UTF-8
    body = accessor(mimetype="text/plain")

    # Now let's take the first three sentences or the whole content of body
    sentences = body.split(".")

    if len(sentences) > 3:
        intro = ".".join(sentences[0:3])
        intro += "."  # Don't forget closing the last sentence
    else:
        # Body text is shorter than 3 sentences
        intro = body

    content.setDescription(intro)

# context is the reference of the folder where this script is run
for id, item in context.contentItems():
    # Iterate through all content items (this ignores Zope objects like this script itself)

    # Use RestrictedPython safe logging.
    # plone_log() method is permission aware and available on any contentish object
    # so we can safely use it from through-the-web scripts
    context.plone_log("Fixing:" + id)

    # Check that the description has never been saved (None)
    # or it is empty, so we do not override a description someone has
    # set before automatically or manually.
    # Note: read the description of the item, not of the folder (context)
    desc = item.Description()  # Archetypes accessor method, returns UTF-8 encoded string

    if desc is None or desc.strip() == "":
        # We use the HTML of the field called "text" to generate the description
        create_automatic_description(item, "text")

# This will be printed in the browser when the script completes successfully
return "OK"
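One weakness of splitting on "." is that it also cuts at version numbers such as "4.3" and ignores sentences ending in "!" or "?". As one possible refinement, here is a minimal sketch of a different splitting rule: break only where ".", "!" or "?" is followed by whitespace. The helper name first_sentences and its count parameter are made up for this illustration and are not part of the script above. It is meant to be tried in a plain Python interpreter; the re module may not be importable from a through-the-web script, so treat it as an idea rather than a drop-in replacement.

```python
import re

# Hypothetical helper, not part of the original script.
# A sentence end is ".", "!" or "?" followed by whitespace, so "4.3" stays intact.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def first_sentences(body, count=3):
    """Return the first `count` sentences of a plain-text body."""
    # Collapse newlines and repeated spaces left over from HTML -> text conversion
    text = " ".join(body.split())
    sentences = SENTENCE_END.split(text)
    # Slicing is safe even if the body has fewer than `count` sentences
    return " ".join(sentences[:count])


print(first_sentences("Plone 4.3 is out. It is fast! Is it stable? Yes. Very."))
# prints: Plone 4.3 is out. It is fast! Is it stable?
```

This still gets abbreviations like "e.g. this" wrong, so it is only a small step up from the plain split.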