We have a nice Christmas present for you: Lily 1.1 is out, and there are improvements for everyone: developers, administrators and Lily hackers. Read more about the exciting new stuff in Lily 1.1 below!
Lily adds a high-level data model on top of HBase. Originally, the model was a simple list of fields stored within records, but we have added some field types that make the model a whole lot more interesting. A first addition is the RECORD value type. You can now store records inside records, which is useful for storing structured data in fields. For indexing purposes, you can address sub-record data as if it were linked records, using dereferencing.
Two other cool new value types are LIST and PATH, which allow for far more flexible modeling than the previous multi-value and hierarchy field properties. At the schema level, we adopted a generics style of defining value types, for instance LIST<LIST<STRING>> defines a field that will contain a list of lists of strings. Finally, we also added a BYTEARRAY value type for raw data storage.
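To make the generics-style notation a bit more concrete, here is a minimal sketch of how an expression such as LIST<LIST<STRING>> can be read as a nested type. The ValueTypeSketch class and its parse method are hypothetical names made up for this illustration; they are not Lily's schema classes, and the sketch assumes every type takes at most one type argument.

```java
// Hypothetical illustration only: these are not Lily's actual schema classes.
class ValueTypeSketch {

    /** A value type name plus an optional nested type argument, e.g. LIST<STRING>. */
    static final class ValueType {
        final String name;
        final ValueType nested; // null for simple types such as STRING

        ValueType(String name, ValueType nested) {
            this.name = name;
            this.nested = nested;
        }

        /** Number of type levels: STRING is 1, LIST<LIST<STRING>> is 3. */
        int depth() {
            return nested == null ? 1 : 1 + nested.depth();
        }

        @Override
        public String toString() {
            return nested == null ? name : name + "<" + nested + ">";
        }
    }

    /** Parses a single-argument generics-style expression such as LIST<LIST<STRING>>. */
    static ValueType parse(String spec) {
        String s = spec.trim();
        int open = s.indexOf('<');
        if (open < 0) {
            // Simple type: no brackets allowed at all.
            if (s.isEmpty() || s.contains(">")) {
                throw new IllegalArgumentException("Invalid value type: " + spec);
            }
            return new ValueType(s, null);
        }
        if (open == 0 || !s.endsWith(">")) {
            throw new IllegalArgumentException("Invalid value type: " + spec);
        }
        String name = s.substring(0, open);
        String inner = s.substring(open + 1, s.length() - 1);
        return new ValueType(name, parse(inner));
    }

    public static void main(String[] args) {
        ValueType type = parse("LIST<LIST<STRING>>");
        System.out.println(type + " has " + type.depth() + " levels");
    }
}
```

Reading the type recursively like this is what lets a LIST hold another LIST, or a PATH, without the schema needing a separate flag for every combination.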
If you're familiar with multi-user environments, you surely know about the problem of concurrent updates. For these situations, Lily now provides a lock-free, optimistic concurrency control feature we call conditional updates. The update and delete methods allow you to add a list of mutation conditions that need to be satisfied before the update or delete is applied.
For concurrency control, you can require that a field still has the same value it had when the record was read, before the update is applied.
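The idea can be sketched with a tiny in-memory store. Everything here (ConditionalUpdateSketch, InMemoryStore, and this MutationCondition class) is a hypothetical stand-in written for this example, not the Lily client API. The synchronized keyword only makes the check-and-write step atomic inside this toy store; the point is that a client holds no lock between reading a record and updating it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only: not the Lily repository API.
class ConditionalUpdateSketch {

    /** Condition: the named field must currently equal the expected value. */
    static final class MutationCondition {
        final String field;
        final Object expected;

        MutationCondition(String field, Object expected) {
            this.field = field;
            this.expected = expected;
        }
    }

    /** Minimal in-memory store: record id -> (field name -> value). */
    static final class InMemoryStore {
        private final Map<String, Map<String, Object>> records = new HashMap<>();

        synchronized void create(String id, Map<String, Object> fields) {
            records.put(id, new HashMap<>(fields));
        }

        /** Returns a copy of the record's fields, or null if the record is unknown. */
        synchronized Map<String, Object> read(String id) {
            Map<String, Object> current = records.get(id);
            return current == null ? null : new HashMap<>(current);
        }

        /** Applies the changes only if every condition holds; returns whether it did. */
        synchronized boolean update(String id, Map<String, Object> changes,
                                    List<MutationCondition> conditions) {
            Map<String, Object> current = records.get(id);
            if (current == null) {
                return false;
            }
            for (MutationCondition c : conditions) {
                if (!Objects.equals(current.get(c.field), c.expected)) {
                    return false; // somebody else changed the field in the meantime
                }
            }
            current.putAll(changes);
            return true;
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        Map<String, Object> initial = new HashMap<>();
        initial.put("price", 10);
        store.create("r1", initial);

        // Remember the value we read, and require it to be unchanged at update time.
        Object seen = store.read("r1").get("price");
        List<MutationCondition> unchanged =
                java.util.Collections.singletonList(new MutationCondition("price", seen));

        Map<String, Object> change = new HashMap<>();
        change.put("price", 12);
        System.out.println("first update applied: " + store.update("r1", change, unchanged));
        System.out.println("stale update applied: " + store.update("r1", change, unchanged));
    }
}
```

A client whose update is rejected would typically re-read the record and decide whether to retry.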
Lily 1.1 ships with a toolchest for Java developers who want to run unit tests against an HBase/Lily application stack. The stack can be launched embedded or externally, with simple scripts straight out of the Lily distribution. You can also request a 'state reset', clearing a single-node instance of Lily for subsequent test runs. Yes, you can now run Lily, HBase, ZooKeeper, HDFS, Map/Reduce and Solr in a single VM, with a single command.
For the fearless Lily repository hacker, we offer two hooks to expand the functionality of the Lily server process. There are decorators, which can intercept any CRUD operation for pre- or post-execution of side-effect operations (like modifying a field value before actually committing it).
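As a rough picture of what a decorator does, the sketch below wraps a store and cleans up field values before the create call reaches the underlying implementation, then does a small piece of post-processing afterwards. The RecordStore interface and both classes are hypothetical names invented for this example; the actual Lily decorator interfaces are not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: not Lily's decorator interfaces.
class DecoratorSketch {

    /** The operation being decorated; a real repository would have the full CRUD set. */
    interface RecordStore {
        Map<String, Object> create(Map<String, Object> fields);
    }

    /** The "real" store: here it simply returns a copy of what it was given. */
    static final class SimpleStore implements RecordStore {
        @Override
        public Map<String, Object> create(Map<String, Object> fields) {
            return new HashMap<>(fields);
        }
    }

    /** Decorator: trims string values before the create, counts creates after it. */
    static final class TrimmingDecorator implements RecordStore {
        private final RecordStore delegate;
        int createdCount = 0;

        TrimmingDecorator(RecordStore delegate) {
            this.delegate = delegate;
        }

        @Override
        public Map<String, Object> create(Map<String, Object> fields) {
            // Pre-execution step: modify field values before they are committed.
            Map<String, Object> cleaned = new HashMap<>();
            for (Map.Entry<String, Object> e : fields.entrySet()) {
                Object v = e.getValue();
                cleaned.put(e.getKey(), v instanceof String ? ((String) v).trim() : v);
            }
            Map<String, Object> result = delegate.create(cleaned);
            // Post-execution step: a side effect once the operation has succeeded.
            createdCount++;
            return result;
        }
    }

    public static void main(String[] args) {
        RecordStore store = new TrimmingDecorator(new SimpleStore());
        Map<String, Object> fields = new HashMap<>();
        fields.put("title", "  Lily 1.1  ");
        System.out.println(store.create(fields));
    }
}
```

Because the decorator implements the same interface as the store it wraps, callers do not need to know it is there, and several decorators can be stacked.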
The global rowlog queue is now distributed across a pre-split table, with inserts and deletes going to several region servers. This will lead to superior performance on write- or update-heavy multi-node cluster setups.
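One common way to spread writes over a pre-split HBase table is to prefix each row key with a shard number derived from a hash, so that consecutive messages land in different regions. Whether the rowlog uses exactly this scheme is an assumption on our part for the sake of illustration; SaltedKeySketch and its methods are hypothetical names.

```java
// Hypothetical illustration only: a generic key-salting scheme, not the rowlog's actual key layout.
class SaltedKeySketch {

    /** Picks a shard in the range [0, shards) from the message id's hash code. */
    static int shardFor(String messageId, int shards) {
        return Math.floorMod(messageId.hashCode(), shards);
    }

    /** Builds a row key whose two-digit prefix decides which pre-split region receives it. */
    static String saltedKey(String messageId, int shards) {
        return String.format("%02d-%s", shardFor(messageId, shards), messageId);
    }

    public static void main(String[] args) {
        int shards = 16; // assumed number of pre-split regions
        for (String id : new String[] {"msg-1", "msg-2", "msg-3"}) {
            System.out.println(saltedKey(id, shards));
        }
    }
}
```

With a table pre-split on the prefixes, each region server receives only a share of the inserts and deletes instead of one server taking them all.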
Our first customers (*waves to our French friends*) found our API to be a tad too verbose and suggested a Builder pattern approach. We listened, and now unveil a totally new (but optional) method-chaining Builder API for users of the Java API.
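For readers who have not met the pattern, this is roughly what method chaining buys you. The Record and RecordBuilder classes below are hypothetical, written only to show the style; they are not the classes or method names of the Lily Builder API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not the actual Lily Builder API.
class RecordBuilderSketch {

    /** Immutable result of the builder. */
    static final class Record {
        final String type;
        final Map<String, Object> fields;

        Record(String type, Map<String, Object> fields) {
            this.type = type;
            this.fields = fields;
        }
    }

    /** Each setter returns the builder itself, so calls can be chained. */
    static final class RecordBuilder {
        private String type;
        private final Map<String, Object> fields = new LinkedHashMap<>();

        RecordBuilder recordType(String type) {
            this.type = type;
            return this;
        }

        RecordBuilder field(String name, Object value) {
            fields.put(name, value);
            return this;
        }

        Record build() {
            if (type == null) {
                throw new IllegalStateException("record type is required");
            }
            // Copy the fields so later builder calls cannot change the built record.
            return new Record(type, new LinkedHashMap<>(fields));
        }
    }

    public static void main(String[] args) {
        Record book = new RecordBuilder()
                .recordType("Book")
                .field("title", "Lily in Action")
                .field("pages", 321)
                .build();
        System.out.println(book.type + " " + book.fields);
    }
}
```

Compared with creating an empty record and calling a setter per line, the chained form reads as one statement and makes it harder to forget a required step.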
For Lily Enterprise customers, we rewrote our cluster installer using Apache Whirr, making us one of the first serious adopters of this exciting cloud and cluster management tool. Using this, installing Lily on many nodes becomes a breeze. Here's a short movie showing off the new installer.
Thanks to better parallelization, Lily has become considerably faster. You can now comfortably throw more clients at one Lily cluster and see combined throughput scale fast.
All in all, Lily 1.1 was a great release to prepare. We hope you have as much fun using Lily 1.1 as we had building it. Check it out here: www.lilyproject.org.
(press release)
Outerthought and Oxynade, two software companies from the Ghent area in Belgium, are collaborating in the context of the TWIRL project, a European research project on open platforms able to process, mine, interlink and fuse data originating from real-world applications and online data sources. The research groups IBBT/Ghent University (IBCN) and Sirris are joining the effort.
The research project sits at the root of new developments in Lily, the Big Data content repository from Outerthought, which will combine storage, indexing and search with profile management, analytics and content recommendations in its next versions. The Lily platform is based on Apache Hadoop and HBase and is the world's first NoSQL/Big Data content repository.
Hadoop and HBase are being used in large organizations such as TomTom, Netlog, Twitter, Facebook and Yahoo!, but the technology is still regarded as complex and finicky to use. Lily makes this technology easy to adopt and use for every organization with large-scale data management needs.
"Research without field validation has no purpose", says Steven Noels, Outerthought CEO, "so we are very happy to be able to collaborate with Oxynade on the TWIRL project. They provide us with a lot of practical experience and knowledge about event recommendations, and they possess an immense set of practical trial data. Also, due to their international growth ambitions, they are in need of a data platform that can scale widely, which means Lily is a great fit for them."
The TWIRL project also got the EU ITEA label, which guarantees high-quality industry research.
Outerthought and Oxynade are receiving a Flemish research grant of 550.000 EUR for their collaboration, a 100 man-month project. The tangible collaboration plans between the industry group and the research partners Sirris and IBBT had a positive impact on the grant approval.
Contact:
Steven Noels - Outerthought - +32 9 338 82 20
Niko Nelissen - Oxynade - +32 9 233 40 09
As mentioned in our previous newsletter, this summer is hot in Lily-land. We have some more exciting announcements to make:
We're very happy and proud to be included in the initial list of Cloudera Connect™ Partners, announced just a couple of days ago. Lily is also prominently featured in a Solution Spotlight, explaining the value-add of Lily on top of the Cloudera Distribution of Hadoop. We're excited to work together with Cloudera on making Big Data and HBase easier to use for enterprise developers in retail, media and news.
If you want to learn more about how Lily fits well with Cloudera, a solution brief is available as well.

Also, we're participating in the Accenture Innovation Awards, a yearly contest for novel business and technology ideas.
This Summer, we’re hard at work on some exciting new features of Lily, while helping some customers to kick off their new Lily-based projects. It’s great to see our product being used in practice, a truly educational experience for ourselves as well. Let’s have a look at the new stuff.
1. Lily Test Framework
We’ve always said Lily is about developer convenience: making the hard bits easy. At the core of Lily, we’re solving the really hard problem of consistent index maintenance between HBase and Solr. On the outside, however, we also wanted to make it easy for enterprise devs to Get Things Done with Lily, as if it were just another database component they already know.
That means it should be easy to call Lily from inside unit tests, and that you shouldn’t need half a cluster per developer just to program against Lily and its API. We wanted something that allows you to launch Lily and all of its constituents with a single call, embeddable and “laptop-class”-compatible (i.e. not relying on virtualization tools to launch pre-made Linux images with Lily, just to run Lily on a developer workstation).
So we went off and created a Lily test framework that launches Lily, HBase, Hadoop, Solr and friends quickly, easily and efficiently, with sane single-node preconfigurations and some hooks to cycle data initialization as well. It has been available from trunk for a couple of days. Some people will be really happy to learn that this Lily test framework supports Windows as well (except, currently, for map/reduce operations). The Lily Test Framework allows you to call Lily from inside unit tests, either standalone or embedded. The standalone mode can also be used to quickly set up a single-node, single-process instance of Lily to play around with.
Adding onto that, there’s a Maven goal available now to quickly set up a fresh, empty Lily project.
All this might sound pretty mundane, and to some extent it is, but it's one of the baby steps we see as necessary to emphasize that we're really serious about bringing Lily to the enterprise, not requiring PhD-level hackers to get started with Big Data.
2. Lily cluster installs
For Lily Enterprise customers, we’re currently preparing a Whirr-based cluster installation that supports both cloud-based installs (e.g. Amazon EC2) and "bring your own nodes" installations - on your own cluster. Whirr is an Apache project under incubation for installing, setting up and running cloud services in a platform-neutral way. We’ve been happily contributing a slew of patches and issue reports (221, 338, 339, 240 and 342) while working on Lily support for Whirr.
Using Whirr and our Lily Enterprise installation packages, setting up complex multi-node installs of Lily will be a breeze.
3. The Lily Adoption Program
As mentioned during our well-attended first Lily webinar, we've set up a Lily adoption roadmap that helps enterprises to easily discover, explore, adopt, deploy and support Lily inside their organization. It's a multi-step, facilitated process with workshops, proof-of-concept support, training and implementation assistance based on our many years of project experience. The 2-day workshop is still available at a discounted rate (-20%) until the end of September, so don't hesitate to get in touch with us.
It’s a long (temperature-wise not so hot unfortunately) Summer over here in Lily-land, and we’re making great progress whipping up some exciting new features. Stay tuned for more!
The AppsForGhent hacking event and contest last Saturday was a success on every scale you can measure it by. For me, the most important one was the participation of the various teams, in both quality and quantity. Kudos to all who participated and gave it their best shot.
Also nice to see is that more and more stuff (code and ideas) is being shared out there. Here is our presentation by the way:
I'm looking forward to seeing the teams (and the city) show a prolonged commitment in the second challenge of the event: creating a working application during the next two weeks. I will be keeping an eye on Twitter (@AppsForGhent) to track progress.