Monday, February 15, 2010
Semantic Web: an alternative for RDFa
I believe that a better alternative in an HTML5 world is to keep RDF separate from web pages but have a clear set of rules for finding the RDF data files that correspond to web pages (either static or generated). One rule might be to look for a file named index.rdf for top-level domain URLs; for example, see if http://markwatson.com/index.rdf exists for http://markwatson.com. For a URL like http://markwatson.com/hobbies, look for http://markwatson.com/hobbies.rdf or http://markwatson.com/hobbies/index.rdf.
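As a rough illustration of the lookup rules above, here is a minimal Python sketch that only computes the candidate RDF URLs for a page; it does not fetch anything. The function name candidate_rdf_urls is made up for this example, and the handling of trailing slashes is my own assumption rather than part of any agreed convention.

```python
from urllib.parse import urlsplit, urlunsplit


def candidate_rdf_urls(page_url):
    """Return the RDF file URLs to try for a given web page URL.

    Hypothetical helper: the rules are the ones sketched in the post,
    not an existing standard.
    """
    parts = urlsplit(page_url)
    # Assumption: a trailing slash is ignored, so /hobbies/ acts like /hobbies.
    path = parts.path.rstrip("/")
    if not path:
        # Top-level URL: look for index.rdf at the site root.
        paths = ["/index.rdf"]
    else:
        # Deeper URL: try <path>.rdf first, then <path>/index.rdf.
        paths = [path + ".rdf", path + "/index.rdf"]
    # Query strings and fragments are dropped from the candidates.
    return [urlunsplit((parts.scheme, parts.netloc, p, "", "")) for p in paths]


if __name__ == "__main__":
    for url in ("http://markwatson.com", "http://markwatson.com/hobbies"):
        print(url, "->", candidate_rdf_urls(url))
```

A client would then request each candidate in order and use the first one that exists.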
Although CMS support (e.g., Drupal) for RDFa and helper libraries like the RDFa Rails plugin might make it fairly easy for some web sites to provide RDFa, I think that we need something simpler that might be adopted by more web sites.
I am writing an open source tool (that will be an example program in the Semantic Web book I am writing) that will generate RDF data from web pages. I'll post a link when the code is ready.
Labels: semantic web
Thursday, January 21, 2010
The beauty of LaTeX: my AllegroGraph book becomes two books, one for JVM languages and one for Lisp
I have been working on and off for 16 months on a book about Semantic Web (or Linked Data) application programming using the AllegroGraph product. I have decided to substantially increase the scope of this applications/tutorial style book to also include support for Sesame. The figure on the left shows the software architecture road map for the book using JVM languages.
I am splitting the book into two volumes, and LaTeX makes it really easy to share small amounts of common material so that both books stand on their own. LaTeX also makes it easy to combine both books into one all-inclusive book, eliminating the duplicated parts. The two volumes are:
- Volume I: will cover the use of both AllegroGraph and Sesame using JVM languages: Java, Scala, JRuby, and Clojure. I am working on a common wrapper written in Java that supplies my own (rather simple) API to both AllegroGraph and Sesame. My wrapper implements Sesame support for geolocation and free text indexing and search so the wrapper is adequate to run all of the book examples using either AllegroGraph or Sesame "back ends."
- Volume II: will cover only AllegroGraph using both the embedded and client Lisp APIs.
Labels: Clojure, Java, JRuby, Latex, Lisp, RDF, Scala, semantic web
Monday, July 20, 2009
Book project, Google Wave, and a kayaking video
I received a Google Wave Sandbox invitation today. I am going to try to spend an hour or two a day with Wave to get up to speed. Fortunately, I am 100% up to speed using the Java AppEngine (initially, Wave Robots, etc. get hosted on AppEngine, either Java or Python versions) and I have some experience with GWT - so I should already be in good shape -- but I need to write some code :-)
My wife took a short video of me kayaking yesterday.
Labels: AllegroGraph, Google, Sedona, semantic web, Wave
Wednesday, July 08, 2009
Continuing to work on my AllegroGraph book
I don't think that the market will be large for an AllegroGraph (AG) book, but after using AG on one customer project and experimenting (off and on) with it for several years, I decided that it was Semantic Web technology worth mastering. AG is a commercial product, but a free server version (supports Lisp, Ruby, Java, and Python clients) is available that is limited to 50 million RDF triples (a large limit, so many projects can simply use the free version).
AG supports the Sesame (an open source Java RDF data store) REST style APIs so if you stick with SPARQL and only RDFS reasoning, you get portability to also use a BSD licensed alternative. That said, my reason for using AG is all of the proprietary extra goodies!
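To make the portability point concrete, here is a small Python sketch that builds (but does not send) a SPARQL query request in the style of the Sesame HTTP protocol, where a repository is queried at /repositories/<id> with a query parameter. The server URL and repository name below are placeholders I made up, and build_sparql_request is a hypothetical helper; check the documentation of whichever server you use for the exact paths it accepts.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(server_url, repository_id, sparql):
    """Build a GET request for a SPARQL query against one repository.

    Hypothetical helper: assumes a Sesame-style endpoint layout of
    <server_url>/repositories/<repository_id>?query=<encoded SPARQL>.
    """
    query_string = urlencode({"query": sparql})
    url = f"{server_url.rstrip('/')}/repositories/{repository_id}?{query_string}"
    # Ask for SPARQL results in JSON; other result formats are possible.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    # Placeholder server URL and repository id, for illustration only.
    request = build_sparql_request(
        "http://localhost:8080/openrdf-sesame",
        "test",
        "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10",
    )
    print(request.full_url)
```

Because the request only depends on the server URL and repository id, pointing the same client code at a different back end is mostly a matter of changing those two values, as long as the queries stay within plain SPARQL.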
In addition to a few Lisp, Python, Ruby, and Java client examples, I am going to incorporate a lot of useful Common Lisp utilities for information processing that I have been working on for many years: this will motivate me to package up a great deal of my Common Lisp code and release it with an open source license. I plan on releasing the book for free as a PDF file and as a physical book for people who want to purchase it. The book and the open source examples should be available before the end of this year.
Labels: Lisp, semantic web
Wednesday, February 04, 2009
Web 3.0: not just Semantic Web and Linked Data, also interop on languages and platforms
A big part of a shift towards a value/production based Web 3.0 that combines material for human readers and linked business software systems is the reduction of cost through open source software. It is clear that when using and building highly distributed systems on the web platform we need to take advantage of multiple platforms (Java, Ruby/Rails, PHP, Python/Django, etc.). I noticed that IBM is releasing a new version of Project Zero that provides an integrated Java and PHP deployment platform. My personal platforms of choice are Rails and server-side Java, so I prefer Sun's GlassFish/JRuby/Rails/Java bundle.
The point that I am making is that platform choice is often guided by what combination of major open source web application frameworks best fit our business needs. A secondary concern is how we merge and integrate applications like (for example) PHP based SugarCRM, Java Business Intelligence stacks, and custom Ruby on Rails applications.
The final piece of the "great frugality" is learning to live with and accept open source licenses like the GPL and AGPL that to a large degree force the sharing of infrastructure software. Refusing to do so can be an expensive mistake: you fail to take advantage of the cost reduction from open software infrastructure, when you could have that cost reduction and still gain either competitive advantages or at least efficiency and profitability from your own business processes and knowledge.
Labels: semantic web, web applications
Wednesday, December 03, 2008
Good article on adding security to Semantic Web applications
Labels: semantic web
Monday, November 24, 2008
Something fun: new book project on the Semantic Web using AllegroGraph
This book is fairly easy for me to write because I have existing coding experiments for just about all the Semantic Web application examples in the book. Also, since there are so many good Semantic Web references on the web and in existing books, I am only covering the SW technology that is used in the book examples. I want the book to be self contained: just enough tutorial and reference material covering AllegroGraph and other SW technologies so readers can completely understand the application examples.
Labels: semantic web
Monday, October 06, 2008
SWI-Prolog and the Semantic Web
I noticed (see linked PDF paper) this morning that the RDFizing and Interlinking the EuroStat Data Set Effort (riese) architecture (diagram) uses SWI-Prolog on the back end. Very cool. The riese web site itself is interesting: human readable web pages with embedded RDFa for semantic web software agents. (Make sure you view page source in your browser.)
Labels: Prolog, semantic web
Monday, September 15, 2008
Distributed robust system for provenance and trust in Semantic Web Applications and Tim Berners-Lee's new World Wide Web Foundation
First, I want to describe the problem to be solved: assuming the existence of RDF/RDFS/OWL data on the web, how do you know what is correct and what is faked for whatever nefarious reasons? What is the provenance of the data? Even human readers have a difficult time separating out real information from rumors, errors, and outright lies on the web.
Proposed solution: organizations "sign" data with a certificate for either a fee or other motivation. Using the current technology, RDF triples would be reified with one or more "trust tokens" (also implemented as RDF) from known signers who vouch for the provenance and accuracy of data. For now, this rating would have to be performed by human analysts, but could hopefully be done quickly and not too expensively with something like Amazon's Mechanical Turk system. I don't see this trust measurement as a Boolean trust or no-trust value; rather, it is a numeric range. Further: known signers can rate other signers, so signers would themselves have a trust score. Accuracy and provenance of data could thus be assigned a trust score based on the trust ratings given by one or more signers and the trust scores of the signers themselves. The problem is to make this process of assignment a small fraction of the cost of producing RDF/RDFS/OWL knowledge sources while adding significant extra value.
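One simple way to combine signer ratings with the signers' own trust scores is a weighted average. The Python sketch below is only an illustration of that idea: the 0.0 to 1.0 scale, the function name data_trust_score, and the choice of a weighted average are all my assumptions, not part of any existing trust framework.

```python
def data_trust_score(ratings, signer_trust):
    """Combine per-signer ratings of some data into one trust score.

    ratings:      {signer name: rating of the data, 0.0 to 1.0}
    signer_trust: {signer name: trust score of that signer, 0.0 to 1.0}

    Hypothetical scheme: each rating is weighted by the trust score of
    the signer who gave it. Unknown signers get a weight of zero.
    """
    total_weight = sum(signer_trust.get(signer, 0.0) for signer in ratings)
    if total_weight == 0:
        # No trusted signer has rated this data.
        return 0.0
    weighted_sum = sum(
        signer_trust.get(signer, 0.0) * rating
        for signer, rating in ratings.items()
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Made-up signers and numbers, for illustration only.
    ratings = {"signer_a": 1.0, "signer_b": 0.0}
    signer_trust = {"signer_a": 0.75, "signer_b": 0.25}
    print(data_trust_score(ratings, signer_trust))
```

The signer trust scores themselves could be computed the same way, from the ratings that signers give each other.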
There is a lot of literature; try searching for "web of trust semantic web" and "provenance semantic web". When I read about Tim Berners-Lee's new World Wide Web Foundation this morning I started to hope that they might develop some open and free infrastructure software to support trust annotation of data. The high economic cost of quality trust-rated RDF/RDFS/OWL knowledge sources is definitely a problem, but it is difficult to even imagine the possible range of financial and social benefits. Having standard open source software to manage trust would help reduce costs for providing trust and provenance data through a network of cooperating trust providers.
Labels: semantic web
Thursday, July 24, 2008
Dynamic language 'goodness': comparing JRuby and Java Semantic Web example programs
Labels: Java, RDF, Ruby, semantic web
Saturday, May 17, 2008
Book review: "Semantic Web for the Working Ontologist"
There are a few tiny annoyances with this book, the primary one being small errors in the text that should have been caught in technical review. These do not however detract at all from the usefulness of the book - it is just too bad that such a very well thought out book has easily fixed mistakes.
For me, one of the potential uses of this book is to loan it or recommend it to customers who might want or need to use Semantic Web technology. I make my living as a consultant, and it is important to have well informed customers. This book will provide a good understanding and rationale for technically inclined customers, especially people with strong domain knowledge who want to (and can) directly participate in modeling efforts.
Labels: semantic web
Saturday, May 05, 2007
Interesting technology: AllegroGraph
The thing that I find interesting about using AllegroGraph is that you are dealing with disk-based persistent data, but not dealing with objects - not dealing with object relational mapping, etc. Instead, you work with graph data structures that are stored on disk, with parts cached in memory. Interesting stuff.
Still, dealing with RDF is not optimal compared to dealing with graphs in memory. As an example: I used to work a lot with Rete networks using Lisp (hacking Charles Forgy's Lisp code), and dealing with graph data structures built up with Lisp lists, cons, etc. is just easier to do. In-memory graphs, semantic networks, etc. are just easier for me to wrap my thoughts around. However, approaches like AllegroGraph have the advantage of scalability.
Labels: RDF, semantic web
Tuesday, April 17, 2007
The Semantic Web, Parrots, and AI
Our small parrot must have some abstract world model of objects and his own body. Why and how he thought of raising one shoulder while lowering the other to compress the width of his shoulders is a mystery to me, but I believe that this was possibly an example of abstract thinking.
Labels: AI, semantic web
