Site News
All things Plone UI
Sprint
During the Plone conference I put on a UI sprint. We had about a dozen participants and it was vastly successful in getting people into the fold and recruiting people to help out on the Plone UI team.
Unified reference browser widget/contents browser
Roché Compaan, Dylan Jay and Tom Gross (and maybe others?) worked on creating a unified content browser widget. Initially, they were just going to repurpose archetypes.referencebrowserwidget to work with formlib and z3cform. After getting direction from Alexander Limi, it was decided we needed an overarching content browser widget that could be used for folder contents, field widgets and anything related to browsing for content on the site.
After a lot of discussion and planning, they started working on a contents browser prototype that we'll get tested later.
Improved Content Rules UI
Thomas Desvenain and Vilmos Somogyi worked on implementing the https://dev.plone.org/ticket/13152 ticket. They did a lot of great work and are almost done implementing the ticket.
Deco
We also had a meeting with Alex Limi to get everyone on the same page with Deco. Rok is about to make a new release, but the UI will still change quite a bit: how it functions is not yet where we want it, and we haven't integrated many of the CMSUI features.
Misc
- plone.app.controlpanel template fixes(Thijs Jonkman)
- UI Guidelines(me)
- Deco prototype, more on this later
For all of those that I have forgotten to mention, my apologies.
UI Team
I've started the UI team back up again. Some of the team's initial goals are:
- providing feedback for plips and implementations
- guiding Deco's UI
- revamping existing Plone UI
- giving definitive UI guidelines to Plone developers
We've had our first meeting and started to get things kicked up. We have a great group of people and I'm excited to see what we get done.
Deco
The UI team has decided it is very important to get user feedback early on the Deco UI. One of the things I worked on was a new deco prototype, mostly a combination of Rob's work and Rok's toolbar.
UI Testing
We're looking to kick off a more formal UI testing process for any new components entering Plone. The most obvious candidates are Deco and the new contents browser widget being developed.
Long term, we'd like to have UI testing guidelines and a framework for allowing any number of organizations to facilitate UI testing to help Plone get feedback on new features.
A better folder contents implementation
We've found that many of our clients have trouble with the default drag and drop Plone offers. Additionally, we're often asked for a multi-file uploader.
wildcard.foldercontents is the packaging of various different features:
- Better drag and drop based off of jQuery UI sortable
- uploadify multi-file uploads based off of collective.uploadify
- automatic folder sorting--idea stolen from collective.sortmyfolder
Here's a brief choppy demonstration of how it works:
New Plone collections and how they might affect you
Shiny new collections
The new collections provide a much nicer query UI with live results. Geir Baekholt has a short video online showing them off: http://blip.tv/eric-steele/geir-baekholt-plone-app-collection-3386446
Upgrading Plone
When upgrading your Plone site, the old collections will still be available to you, only they'll be labeled "Collection (old-style)." Old collections will NOT be migrated to new-style collections.
Add-on product compatibility
Most add-ons that use collections for their functionality are not currently compatible.
Enabling old-style collections
If you're starting a new Plone site from scratch, the old collections will not be enabled by default and you may still want to use them on your site--especially if you're running add-ons that still depend on the old-style collections.
To manually enable old-style collections, follow these steps:
- Visit the ZMI(or append /manage onto the url of your plone site)
- Click "portal_types"
- Click "Topic (Collection (old-style))"
- Check the "Implicitly addable?"
- Click the "Save" button
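The same change can be sketched in Python, for example from a debug prompt or an upgrade step. This is only a rough sketch: it assumes the "Implicitly addable?" checkbox corresponds to the global_allow property on the type information object, and the helper name enable_old_style_collections and the Fake* stand-in classes are made up for illustration so the snippet runs outside of Plone.

```python
def enable_old_style_collections(portal):
    """Make the old-style "Topic" type addable again.

    Assumes ``portal.portal_types`` is the types tool and that the
    "Implicitly addable?" checkbox maps to the ``global_allow`` property.
    """
    fti = portal.portal_types.getTypeInfo("Topic")
    if fti is None:
        raise ValueError("This site has no Topic type registered")
    fti.global_allow = True
    return fti


# Minimal stand-ins (hypothetical) so the sketch runs without a Plone site.
class FakeTypeInfo(object):
    global_allow = False


class FakeTypesTool(object):
    def __init__(self):
        self._types = {"Topic": FakeTypeInfo()}

    def getTypeInfo(self, name):
        return self._types.get(name)


class FakePortal(object):
    def __init__(self):
        self.portal_types = FakeTypesTool()


portal = FakePortal()
topic_fti = enable_old_style_collections(portal)
print(topic_fti.global_allow)  # True
```

On a real site you would pass your Plone site object instead of FakePortal and commit the transaction afterwards.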
Developing for old and new collections
New-style collections still implement the queryCatalog method, which returns the results from the catalog query, so most likely the only things you'll need to change are interface registrations and references to portal_type.
I have just updated collective.plonetruegallery for the new collections so I'll share some tips on integrating.
Conditional ZCML
In order to be backward compatible, you should use conditional ZCML for any registrations or code that needs to be loaded. The collective docs have a good section on how to do this.
A simple example in practice is:
<browser:page
    zcml:condition="installed plone.app.collection"
    name="myview"
    for="plone.app.collection.interfaces.ICollection"
    class=".views.MyView"
    permission="zope2.View"
    />
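For Python code that has to run with or without the new collections, an import guard is a rough equivalent of the ZCML condition. This is a sketch rather than anything taken from collective.plonetruegallery; the names HAS_NEW_COLLECTIONS and is_new_collection are invented for illustration.

```python
# Detect plone.app.collection at import time, mirroring the
# zcml:condition="installed plone.app.collection" check.
try:
    from plone.app.collection.interfaces import ICollection
    HAS_NEW_COLLECTIONS = True
except ImportError:
    # Older Plone: only old-style collections are available.
    ICollection = None
    HAS_NEW_COLLECTIONS = False


def is_new_collection(obj):
    """Return True only when obj provides the new-style ICollection."""
    if not HAS_NEW_COLLECTIONS:
        return False
    return ICollection.providedBy(obj)


# A plain object is never a collection, with or without the package.
print(is_new_collection(object()))
```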
Registering an interface for new collection
<class class="plone.app.collection.collection.Collection"
zcml:condition="installed plone.app.collection">
<implements interface=".interfaces.IGallery" />
</class>
Retrieve the raw query
from plone.app.querystring import queryparser
query = queryparser.parseFormquery(collectionobj, collectionobj.getRawQuery())
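Because both kinds of collection implement queryCatalog, add-on code can often treat them the same way. Here is a minimal sketch of that idea. It assumes the new-style portal_type is named "Collection" (the old one is "Topic"), and the helper names and the FakeCollection stand-in are hypothetical.

```python
# "Topic" is the old-style type; "Collection" is assumed for the new style.
COLLECTION_TYPES = ("Topic", "Collection")


def is_collection(obj):
    """True for old-style and new-style collections alike."""
    return getattr(obj, "portal_type", None) in COLLECTION_TYPES


def collection_results(obj):
    """Return the catalog results regardless of the collection flavour."""
    if not is_collection(obj):
        raise TypeError("not a collection: %r" % (obj,))
    return obj.queryCatalog()


# Stand-in used only to exercise the helpers outside of Plone.
class FakeCollection(object):
    def __init__(self, portal_type, results):
        self.portal_type = portal_type
        self._results = results

    def queryCatalog(self):
        return list(self._results)


old = FakeCollection("Topic", ["a", "b"])
new = FakeCollection("Collection", ["c"])
print(collection_results(old), collection_results(new))
```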
Document Viewer Integration in Plone
collective.documentviewer integrates the great New York Times Document Viewer into Plone.
Features
- OCR
- Searchable on OCR text
- works with many different document types
- plone.app.async integration with task monitor
- configuration options
- PDF Group view for displaying groups of PDFs
Installation
There is an extensive set of system installation requirements that you must install in order for document viewer to work correctly. Additionally, it is recommended that you install and setup plone.app.async along with this package.
How it works
The docsplit tool is used to generate images and text files for each PDF. The viewer is just a viewer of images and text files, so it's easy to style and customize. The downside of this is that, for every PDF page, 4 files are generated. For sites with a lot of large PDFs, even with blob storage, it's a lot of extra data the ZODB has to manage. That is why basic file storage is also available.
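To get a feel for how quickly this adds up, here is a back-of-the-envelope sketch. The only figure taken from above is the 4 files per page; the site size is a made-up example.

```python
FILES_PER_PAGE = 4  # figure quoted in the paragraph above


def generated_file_count(page_counts):
    """Total number of files generated for PDFs with the given page counts."""
    return FILES_PER_PAGE * sum(page_counts)


# Hypothetical site: 2,000 PDFs of 50 pages each.
total = generated_file_count([50] * 2000)
print(total)  # 400000
```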
The OCR text is also indexed locally with the PDF(using repoze.catalog) and globally with the plone catalog. This is done because a custom index is required for document viewer in order to search text in the PDF.
Configuration Options
After product activation, there will be a control panel item, "Document Viewer Settings."
- Image sizes -- Customize the size of images generated for the viewer
- Storage type -- Allows you to set up file storage for your PDF data
- OCR -- by default, this is off because it can take quite some time to OCR documents(and with no plone.app.async installed, some users could end up being very unhappy)
- Detect Text -- detect if text is already found on PDF, if so, do not OCR
- Auto select layout -- automatically, for PDF and any enabled document types, select the document viewer layout
- Auto layout type -- the types of files that should also be enabled for the document viewer layout.
A Tour
With screenshots, I will go over the various features.
Settings
Make sure to customize any settings before you start using it. Also, make sure to activate any office formats you'd like to be able to use:

The Viewer



Group View
A view is also added for folders and collections to display groups of PDFs and search within the groups.

Converting Office Documents
Office documents are also then able to be converted for the viewer.

Async Integration
How plone.app.async integration is managed.
You can view the current status of your conversion async task by clicking the "Document Viewer Convert" button at any time. Or, if it isn't converting, you can reinitiate conversion there.

You can also monitor all tasks currently in the queue:

What's left
- It's not internationalized at all. Apologies to non-english plone users.
- Better mobile viewer
- If you're converting a lot of documents at once with plone.app.async, there seems to be an issue with conflict errors. Unfortunately, this could cause your document to be converted more than once.
collective.plonetruegallery 2.x demonstration
It's been a long time since I've made a blog post, and this time I thought I'd try a webcast. Well, not really a webcast but a video demonstration of some of collective.plonetruegallery's features. collective.plonetruegallery was one of my first projects for Plone and has matured over the years with lots of help from the community.
New Features
- New gallery display type integrations(supports 8 total now)
- galleria
- nivo slider
- nivo gallery
- pikachoose
- s3slider
- Better inline gallery support
- Gallery portlet can now show full gallery(useful when using in conjunction with Content Well Portlets)
- Products.Collage support
Documentation for installation and installing different display types can be found on pypi and plone.org.
The music choice doesn't quite fit for the video. Sorry folks, I didn't know what else would be appropriate. It's better than silence :)
High Availability Varnish Configuration for Plone
Why
There are many reasons why a backend server could go down or be unresponsive, and there is no reason that your caching proxy can't serve out stale content while it is down or slow to respond.
How
There are a few tricks that will help you get better performance out of varnish and that will trick varnish into serving stale content instead of an error.
Serving Stale Content
Restart the request and have varnish use an always down server on error so that it'll serve stale content right away
- Set up the fake backend
...
backend failapp {
    .host = "127.0.0.1";
    .port = "9999";
    .probe = {
        .url = "/hello/";
        .interval = 12h;
        .timeout = 1s;
        .window = 1;
        .threshold = 1;
    }
}
...
- Set the grace period on the request in vcl_recv
...
if (!req.backend.healthy) {
    set req.grace = 1d;
} else {
    set req.grace = 15m;
}
...
- Set grace period for response in vcl_fetch
...
set beresp.grace = 10d;
...
- Set a marker error header in the vcl_error section and restart the request
...
sub vcl_error {
    /* set a marker on so we know there is an error with the backends
       and that we should serve out stale content */
    if ( req.http.X-Varnish-Error != "1" && req.request != "PURGE" && req.restarts == 0) {
        set req.http.X-Varnish-Error = "1";
        return (restart);
    }
}
...
- Check for the marker error header in the vcl_recv and set to already down backend
...
if (req.http.X-Varnish-Error == "1") {
    set req.backend = failapp;
    unset req.http.X-Varnish-Error;
} else {
    set req.backend = plone;
}
...
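To make the interplay of the two grace settings easier to reason about, here is a small Python model of it. It is a simplification of how I understand grace to work, not Varnish's actual algorithm: it assumes an expired object can be served stale while its age is within the TTL plus the smaller of the request grace and the object grace. The function names are invented.

```python
from datetime import timedelta


def request_grace(backend_healthy):
    """Mirror the vcl_recv snippet: a much longer grace when the backend is sick."""
    return timedelta(minutes=15) if backend_healthy else timedelta(days=1)


def cache_state(age, ttl, object_grace, backend_healthy):
    """Classify a cached object as 'fresh', 'stale-ok' or 'expired'."""
    if age <= ttl:
        return "fresh"
    allowed = min(request_grace(backend_healthy), object_grace)
    if age <= ttl + allowed:
        return "stale-ok"
    return "expired"


ttl = timedelta(minutes=30)
grace = timedelta(days=10)  # beresp.grace from the vcl_fetch snippet
age = timedelta(hours=2)

print(cache_state(age, ttl, grace, backend_healthy=True))
print(cache_state(age, ttl, grace, backend_healthy=False))
```

In this model, a two-hour-old page with a 30 minute TTL is too old to serve while the backend is healthy, but it is still eligible once the backend is marked sick, which is the effect the fake backend trick relies on.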
Cleaning Up The URL
There is no need to cache the different hash urls(#) or different query parameters for google analytics
...
if (req.url ~ "\#") {
    set req.url=regsub(req.url,"\#.*$","");
}
# Strip out Google related parameters
if(req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|gclid|cx|ie|cof|siteurl)=") {
    set req.url=regsuball(req.url,"&(utm_source|utm_medium|utm_campaign|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)","");
    set req.url=regsuball(req.url,"\?(utm_source|utm_medium|utm_campaign|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)","?");
    set req.url=regsub(req.url,"\?&","?");
    set req.url=regsub(req.url,"\?$","");
}
...
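If you want to check what these rewrites do to a given URL before touching your Varnish config, the same steps can be mimicked in Python. This is only an approximation for experimenting: the clean_url name is made up, and the character class is written as A-Za-z rather than the A-z used in the VCL.

```python
import re

GOOGLE_PARAMS = "utm_source|utm_medium|utm_campaign|gclid|cx|ie|cof|siteurl"
VALUE = r"=[A-Za-z0-9_.%-]+"

# One pattern for parameters after "&", one for the first parameter after "?".
AFTER_AMP = re.compile("&(?:" + GOOGLE_PARAMS + ")" + VALUE)
AFTER_QMARK = re.compile(r"\?(?:" + GOOGLE_PARAMS + ")" + VALUE)


def clean_url(url):
    """Normalise a URL the way the VCL snippet does before cache lookup."""
    url = re.sub(r"#.*$", "", url)      # drop the hash part
    url = AFTER_AMP.sub("", url)         # drop "&param=value"
    url = AFTER_QMARK.sub("?", url)      # drop "?param=value", keep the "?"
    url = url.replace("?&", "?", 1)      # tidy up a leftover "?&"
    url = re.sub(r"\?$", "", url)        # drop a dangling "?"
    return url


print(clean_url("/news?utm_source=feed&utm_medium=rss&page=2#top"))  # /news?page=2
print(clean_url("/news?gclid=abc123"))  # /news
```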
Full Example Configuration
In this configuration, keep some things in mind:
- The configuration is manually setting the cache age on these objects and relying more on purges to handle cache refreshes
- The configuration assumes the public site is not for logging in, so no cookie handling is happening
- The configuration sets additional response headers so you can see information on how varnish handled the response(ttl, grace, status, hit)
- This exact configuration is cleaned up from what I actually use in production and you'll need to clean it up and implement your own parts of it to an extent. Please don't assume that this is just a drop in replacement.
acl purge {
    "localhost";
    "127.0.0.1"; /* and everyone on the local network */
    "10.10.10.10";
}

/* failapp is used to help trick varnish into using stale content */
backend failapp {
    .host = "127.0.0.1";
    .port = "9999";
    .probe = {
        .url = "/hello/";
        .interval = 12h;
        .timeout = 1s;
        .window = 1;
        .threshold = 1;
    }
}

backend cms1 {
    .host = "10.10.10.1";
    .port = "8080";
    .connect_timeout = 10s;
    .max_connections = 30;
    .first_byte_timeout = 300s;
    .probe = {
        .url = "/";
        .interval = 3s;
        .timeout = 3s;
        .window = 5;
        .threshold = 2;
        .initial = 1;
    }
}

backend cms2 {
    .host = "10.10.10.1";
    .port = "8081";
    .connect_timeout = 10s;
    .max_connections = 30;
    .first_byte_timeout = 300s;
    .probe = {
        .url = "/";
        .interval = 3s;
        .timeout = 3s;
        .window = 5;
        .threshold = 2;
        .initial = 1;
    }
}

backend cms3 {
    .host = "10.10.10.1";
    .port = "8082";
    .connect_timeout = 10s;
    .max_connections = 30;
    .first_byte_timeout = 300s;
    .probe = {
        .url = "/";
        .interval = 3s;
        .timeout = 3s;
        .window = 5;
        .threshold = 2;
        .initial = 1;
    }
}

backend cms4 {
    .host = "10.10.10.1";
    .port = "8083";
    .connect_timeout = 10s;
    .max_connections = 30;
    .first_byte_timeout = 300s;
    .probe = {
        .url = "/";
        .interval = 3s;
        .timeout = 3s;
        .window = 5;
        .threshold = 2;
        .initial = 1;
    }
}

director plone round-robin {
    { .backend = cms1; }
    { .backend = cms2; }
    { .backend = cms3; }
    { .backend = cms4; }
}
sub vcl_recv {
    if (req.http.X-Varnish-Error == "1") {
        set req.backend = failapp;
        unset req.http.X-Varnish-Error;
    } else {
        set req.backend = plone;
    }

    if (req.request != "GET" &&
        req.request != "HEAD" &&
        req.request != "PUT" &&
        req.request != "POST" &&
        req.request != "TRACE" &&
        req.request != "OPTIONS" &&
        req.request != "DELETE" &&
        req.request != "PURGE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }

    if (req.request != "GET" && req.request != "HEAD" && req.request != "PURGE") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }

    /* Time to mess with the request */
    unset req.http.Cookie;
    unset req.http.User-Agent;
    unset req.http.Accept-Charset;

    if (req.http.Accept-Encoding) {
        if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|pdf|headerImage)$") {
            # No point in compressing these
            remove req.http.Accept-Encoding;
        } elsif (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            # unknown algorithm
            remove req.http.Accept-Encoding;
        }
    }

    # Strip hash, server doesn't need it.
    if (req.url ~ "\#") {
        set req.url=regsub(req.url,"\#.*$","");
    }

    # Strip out Google related parameters
    if(req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|gclid|cx|ie|cof|siteurl)=") {
        set req.url=regsuball(req.url,"&(utm_source|utm_medium|utm_campaign|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)","");
        set req.url=regsuball(req.url,"\?(utm_source|utm_medium|utm_campaign|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)","?");
        set req.url=regsub(req.url,"\?&","?");
        set req.url=regsub(req.url,"\?$","");
    }
    /* End modifying the request */

    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return(lookup);
    }

    /* grace and saint related settings.
       To ensure to always serve static content. */
    if (!req.backend.healthy) {
        set req.grace = 1d;
    } else {
        set req.grace = 15m;
    }
    /* end saint/grace mode stuff */

    return (lookup);
}
sub vcl_error {
    /* set a marker on so we know there is an error with the backends
       and that we should serve out stale content */
    if ( req.http.X-Varnish-Error != "1" && req.request != "PURGE" && req.restarts == 0) {
        set req.http.X-Varnish-Error = "1";
        return (restart);
    }
}

sub vcl_hash {
    set req.hash += req.url;
    if (req.http.Accept-Encoding) { set req.hash += req.http.Accept-Encoding; }
    return (hash);
}

sub vcl_fetch {
    unset beresp.http.set-cookie;

    if (beresp.status == 500) {
        set beresp.saintmode = 5s;
        set req.http.X-Varnish-Error = "1";
        return (restart);
    }

    /* override ttls */
    if(beresp.status == 301 || beresp.status == 302){
        /* all redirects can be cached for a long time. Granted we always have invalidation. */
        set beresp.ttl = 5h;
    } else if(req.url ~ ".*portal_css.+cachekey.*\.(css|js)$") {
        /* generated css/js files should be cached for a LONG time. All unique urls. */
        set beresp.ttl = 10d;
    } else if (req.url ~ "(\.jpg|\.png|\.gif|\.gz|\.tgz|\.bz2|\.tbz|\.mp3|\.ogg|\.pdf|\.css|\.js|/image_(large|preview|mini|thumb|tile))$") {
        /* all file type resources can be cached for an hour */
        set beresp.ttl = 1h;
    }else{
        /* everything else */
        set beresp.ttl = 30m; /* how long should varnish cache it? */
    }

    set beresp.grace = 10d; /* The max amount of time to keep object in cache */

    set beresp.http.X-Varnish-beresp-ttl = beresp.ttl;
    set beresp.http.X-Varnish-beresp-grace = beresp.grace;
    set beresp.http.X-Varnish-beresp-status = beresp.status;
}

sub vcl_hit {
    if (req.request == "PURGE") {
        set obj.ttl = 0s;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        error 404 "Not in cache.";
    }
}

sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
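The TTL overrides in vcl_fetch are the part you are most likely to tune, so it can help to try your own URLs against them first. Below is a Python sketch that mirrors those rules using the same regular expressions; the ttl_for name is invented and this is not a replacement for testing against a real Varnish.

```python
import re
from datetime import timedelta

# Same patterns as in vcl_fetch above.
GENERATED = re.compile(r".*portal_css.+cachekey.*\.(css|js)$")
FILES = re.compile(
    r"(\.jpg|\.png|\.gif|\.gz|\.tgz|\.bz2|\.tbz|\.mp3|\.ogg|\.pdf|\.css|\.js"
    r"|/image_(large|preview|mini|thumb|tile))$"
)


def ttl_for(status, url):
    """Return the TTL the configuration would assign to a response."""
    if status in (301, 302):
        return timedelta(hours=5)    # redirects
    if GENERATED.search(url):
        return timedelta(days=10)    # generated css/js with cache keys
    if FILES.search(url):
        return timedelta(hours=1)    # file type resources
    return timedelta(minutes=30)     # everything else


for status, url in [
    (302, "/old-page"),
    (200, "/portal_css/Sunburst/base-cachekey1234.css"),
    (200, "/logo.png"),
    (200, "/front-page"),
]:
    print(status, url, ttl_for(status, url))
```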
Additional Tips
- Varnish doesn't have nice error messages, so use nginx to override 500 errors to your liking if, for some reason, there is an error on a resource that was not in the stale cache.
- Varnish's cache is NOT persistent(although, varnish 3.0 is supposed to be) so if you restart your varnish process, you'll lose your long term cache.
- Also, you're limited by the size of your cache. If you have a large site, make sure that you set the varnish file cache size to something very large so that you're able to make use of stale content.
Plone 3.3.5 on Mac OS X Lion
Getting Python 2.4
First off, make sure you have a version of python 2.4 installed on the system. If you use the one located in the svn collective, it has a few patches that make it work correctly with Lion.
svn co http://svn.plone.org/svn/collective/buildout/python/
cd python
python bootstrap.py
./bin/buildout
Then use that python with your buildout.
Beware of collective.xdv
I didn't have enough time to figure out why, but xdv was making my instance crash on startup with no explanation. I did get Plone to start up by upgrading to the latest version of xdv and collective.xdv, but it would still crash when rendering a page for me. For now, I've just disabled xdv on Lion, which at least gives me a working Plone 3 dev machine.
Small post but I just wanted to put it up in case someone else was experiencing the same problems.
Fixing Broken ZODB Object references
Introduction
If you start seeing POSKeyErrors on certain objects, it most likely means your database is in an inconsistent state. The problem is very well described by Elizabeth Leddy on her blog here. Her blog didn't quite handle the case that I encountered: missing objects with no oid in the ZODB.
Getting Started
Run fsrefs.py to test your database and have it tell you which objects are bad.
python /path/to/eggs/ZODB/scripts/fsrefs.py /path/to/zodb/Data.fs
Will yield results like:
oid 0x959755L BTrees.OOBTree.OOBucket
last updated: 2011-04-15 13:31:28.380634, tid=0x38DA88B79173877L
refers to invalid object:
    oid 0x0135ca66 missing: ''
oid 0x135CA59L Products.ATContentTypes.content.document.ATDocument
last updated: 2011-04-11 22:21:16.544874, tid=0x38D941D46976A11L
refers to invalid objects:
    oid 0x0135ca65 missing: ''
    oid 0x0135ca5c missing: ''
oid 0x135CA6AL BTrees.OOBTree.OOBTree
last updated: 2011-04-11 22:16:14.294142, tid=0x38D94183CFD03CCL
refers to invalid object:
    oid 0x0135ca6b missing: ''
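With a larger database the report can get long, so it may help to pull out just the missing oids. Here is a small sketch that does this with a regular expression. It assumes the report looks like the output shown above; the missing_oids helper is my own invention and not part of ZODB.

```python
import re

# Matches the "oid 0x... missing:" entries of the report.
MISSING_OID = re.compile(r"oid (0x[0-9a-fA-F]+) missing:")


def missing_oids(report):
    """Return the sorted, de-duplicated missing oids as integers."""
    return sorted(set(int(match, 16) for match in MISSING_OID.findall(report)))


sample = """oid 0x135CA59L Products.ATContentTypes.content.document.ATDocument
last updated: 2011-04-11 22:21:16.544874, tid=0x38D941D46976A11L
refers to invalid objects:
    oid 0x0135ca65 missing: ''
    oid 0x0135ca5c missing: ''
"""

for oid in missing_oids(sample):
    print(hex(oid))
```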
Testing Out The Bad Object
from ZODB.utils import p64
from persistent import Persistent
obj = app._p_jar[p64(0x959755L)]
obj
Should give the error:
2011-05-24 09:23:31 ERROR ZODB.Connection Couldn't load state for 0x0135ca59
Traceback (most recent call last):
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZODB/Connection.py", line 811, in setstate
self._setstate(obj)
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZODB/Connection.py", line 870, in _setstate
self._reader.setGhostState(obj, p)
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZODB/serialize.py", line 604, in setGhostState
state = self.getState(pickle)
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZODB/serialize.py", line 597, in getState
return unpickler.load()
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZODB/serialize.py", line 471, in _persistent_load
return self.load_oid(reference)
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZODB/serialize.py", line 537, in load_oid
return self._conn.get(oid)
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZODB/Connection.py", line 244, in get
p, serial = self._storage.load(oid, self._version)
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZEO/ClientStorage.py", line 712, in load
return self.loadEx(oid, version)[:2]
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZEO/ClientStorage.py", line 735, in loadEx
data, tid, ver = self._server.loadEx(oid, version)
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZEO/ServerStub.py", line 196, in loadEx
return self.rpc.call("loadEx", oid, version)
File "/opt/Zope/buildout-cache/eggs/ZODB3-3.8.4wc1-py2.4-linux-x86_64.egg/ZEO/zrpc/connection.py", line 699, in call
raise inst # error raised by server
POSKeyError: 0x0135ca65
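For anyone wondering what p64 does in the snippet above: as far as I know it simply packs the integer oid into the 8-byte big-endian string that the ZODB uses as a key. The sketch below reproduces that with the standard library so you can experiment without a ZODB install. The *_sketch names are made up, and note that the 0x959755L literal is Python 2 syntax (drop the L on Python 3).

```python
import struct


def p64_sketch(value):
    """Pack an integer oid into an 8-byte big-endian key."""
    return struct.pack(">Q", value)


def u64_sketch(data):
    """Inverse of p64_sketch: unpack 8 bytes back into an integer."""
    return struct.unpack(">Q", data)[0]


oid = p64_sketch(0x959755)
print(repr(oid))
print(hex(u64_sketch(oid)))  # 0x959755
```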
Notes on a More Secure Plone Deployment
Read-only Public Site
Making your public site read-only will prevent even a compromised site from taking any damage--even if a malicious user does somehow gain access, they can't save any different data to the database.
There are a few ways to do this:
- Zope Replication Services (ZRS) allows you to replicate a read-write backend private server to a read-only public facing site
- You can also use RelStorage for your zeoserver. Then use the replication facilities provided by some RDBMSs to replicate to a read-only zeoserver on the public site.
- It is also possible to have read-only zeo clients connected to a read-write zeo server.
- zeoraid might even be an option(never tried it)
One thing to note is that there are some cases where Plone will try to write on read unfortunately. To get around this, I create a before commit event handler in a policy product to abort every transaction when the server is read-only. It's kind of hackish but a necessary evil to prevent a user from getting a nasty ReadOnly database error thrown at them. It would look something like:
from zope.component import adapter
from ZPublisher.interfaces import IPubBeforeCommit
import App.config
import transaction

configuration = App.config.getConfiguration()
readonly = configuration.read_only_database

@adapter(IPubBeforeCommit)
def abortTransactionOnReadOnly(event):
    if readonly:
        transaction.abort()
Rewrite Login URLs
You can also rewrite login urls on the public site to restrict anyone from seeing a login form. Just do normal rewrites at your proxy server.
Urls you'll want to rewrite are:
- /manage
- /login
- /logged_out
- /require_login
- /acl_users
This will prevent anyone from seeing a login form or an unauthorized page.
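The rewrites themselves belong in your proxy server, but when drawing up the rules it can be handy to check which request paths they would catch. The following Python sketch is only a testing aid under my own assumptions: it treats a path as a match when any path segment is exactly one of the names listed above, and the should_rewrite name is made up.

```python
# Path segments from the list above.
BLOCKED_SEGMENTS = frozenset(
    ["manage", "login", "logged_out", "require_login", "acl_users"]
)


def should_rewrite(path):
    """True if any segment of the path (query string ignored) is blocked."""
    path = path.split("?", 1)[0]
    return any(segment in BLOCKED_SEGMENTS for segment in path.split("/"))


for candidate in ["/login", "/news/login?came_from=/", "/news/login-help"]:
    print(candidate, should_rewrite(candidate))
```

Note that this exact-match approach would not catch other ZMI URLs such as /manage_main, so a real proxy rule may need to be broader.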
Using Plone as a Document Repository
Update
It is recommended that you do not use this method anymore. Please use collective.documentviewer now which should cover all the use cases.
We just released a new site that houses thousands of scanned PDF documents that are now viewable in the browser via Flex Paper. We started with PDFs that were just scanned images. Plone, with the help of a few packages, then OCR'd and replaced the PDF with a searchable PDF counterpart.
Features
- Convert Image PDFs to searchable versions
- Split large PDFs into multiple documents
- Overwrite metadata of PDF
- OCR text is then searchable via Plone search
- Online viewable version
- All document processing is done via asynchronous processes so adding documents is not slow
- Can monitor conversion asynchronous processes
Requirements
- wc.pageturner : For online viewable PDFs
- wildcard.pdfpal : heavy lifting in PDF processing
- plone.app.async : asynchronously process PDF documents
- Tesseract > 3.0.1 system package
- swftools system package
- ghostscript system package
- hocr2pdf system package
- pdftk system package
- tiff2pdf system package
Caveats
- Probably only works in Linux
- wildcard.pdfpal is pretty specific and isn't smart about deciding whether it should process the PDF. For instance, if the PDF is already searchable, it'll still try to convert it regardless.
- We're not really interested in wildly supporting pdfpal beyond our use case(that's why it's not listed on plone.org, but in the collective and on pypi). So if you're interested in implementing this, you might end up contributing to the project and cleaning up some of the cruft in the package.
