Hixie's Furniture

Too long; read later - here's a demo: SPARQL Sliders Test

Ian Hickson posted a lovely semweb use case:

"I'd like a search tool for furniture that works like Google's Flight Search does for flights. That is, with sliders so I can say what type of furniture (table), what range of widths (1-2m), lengths (2-5m), and heights (1-2m), what material (wood), what thickness, what price range, etc, I'd like, with the list of available products updating in real time."

As it happens I wanted a slider thingy ages ago, so this was a good prompt to make a demo of the front end part which takes the values from slider components and uses them in a SPARQL query.

For convenience (and lack of available data) the demo runs against DBpedia via the SNORQL SPARQL Explorer. As furniture and its dimensions weren't available, it uses cities and their populations and elevations.
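For illustration, here's roughly what the front end has to do: turn slider positions into a filtered SPARQL query. The property names below (dbo:City, dbo:populationTotal, dbo:elevation) are my guesses at suitable DBpedia terms, not necessarily what the demo actually uses:

```javascript
// Sketch: build a SPARQL query for DBpedia cities from slider values.
// The dbo: property names are assumptions, not the demo's exact query.
function buildCityQuery(minPop, maxPop, minElev, maxElev) {
  return [
    'PREFIX dbo: <http://dbpedia.org/ontology/>',
    'SELECT ?city ?pop ?elev WHERE {',
    '  ?city a dbo:City ;',
    '        dbo:populationTotal ?pop ;',
    '        dbo:elevation ?elev .',
    '  FILTER (?pop >= ' + minPop + ' && ?pop <= ' + maxPop + ')',
    '  FILTER (?elev >= ' + minElev + ' && ?elev <= ' + maxElev + ')',
    '} LIMIT 50'
  ].join('\n');
}

// A slider's onchange handler would call this with the current handle
// positions and send the result off to the SPARQL endpoint.
var q = buildCityQuery(100000, 1000000, 500, 2000);
```

The endpoint round-trip is then just an HTTP GET/POST with the query string, with the results table re-rendered on each change.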

So how would you get real data?

First of all, furniture vendors could either provide dumps of their data or, more Webby, mark up their sites with RDFa and/or HTML5 microdata using e.g. the GoodRelations e-commerce vocabulary.

Ultimately, for a front end like these sliders to work, the data would need to go in a store with a SPARQL endpoint. But triplestores shouldn't be thought of as just a wacky alternative to a SQL database. A triplestore is just a cache of a little chunk of the Linked Data Web. The question of where the store resides and how the data is collected is entirely open. Following the more traditional DB model, a service might aggregate the data published by known furniture suppliers and provide the endpoint online.

But alternatively, a local user agent (I think Chris Bizer had a little Java example, can't find the link... there are others) could crawl the Web to answer the query just-in-time. The advantage of this approach is that it's more thorough and the only real option for totally arbitrary queries; the downside is that its answer will probably take longer than milliseconds. But remember triplestores are caches: not every little bit of information would have to be discovered and read from every page. There are vocabs for dataset and vocab discovery (remind me of the acronyms please :) Note too that you're not limiting your client agent to a single datastore. Traditional backends (SQL or NoSQL) are effectively isolated silos; triplestores are integrated with the links of the Web.

Incidentally, this is something that might be nice to express as a Web Intent, along the lines of "make me a query from this template with these parameters and apply it to this endpoint, putting the results into this widget" (that's a bit verbose for a general-purpose intent, but you get the gist). c.f. RDFAffordances.




danja
2012-01-11T15:01:56+01:00
sparql demo goodrelations rdf hixie furniture

The Emperor's New Client

A wee rant.

Ok, I'm totally with the consensus that the future is Cloud-based, and to be a little more specific Platform-based, and to be even more specific primarily HTTP-based. To back that up, cf.

But to expand on something I mentioned in passing here recently:

in one respect the emperor is stark-bollock naked. Browsers are currently a really sucky environment for client development. Sure, the HTML/CSS-based (standard!) rendering is wonderful. As shown with Node.js (and despite what Google are saying around Dart), Javascript is a reasonably pleasant, perfectly capable programming language. The growth of Ajax and JSON have shown inter-system comms is workable. There are some good dev tools and libraries. So why does working with this stuff feel like pulling your own teeth?

Here I could point to the traditional DOM API, blame the W3C for all the world's ills, and an awful lot of people would nod and smile knowingly. But although that's arguably valid (heh), I reckon the problem is more systemic and can mostly be blamed on browser developers.

Ok, blame is too strong. The decisions made over the years and the directions taken have generally been perfectly rational in the context of the prevailing conditions. But there have been feedback loops at work. The flashy [sic] chrome [sic] surrounding HTML dev, from the img tag onwards, has pulled Web developers in like moths around a flame, so the browser developers act to improve that experience. Meanwhile server-side tech has developed out of the corporate legacy of silo-based systems. Let me quote Steve Yegge there: "It's a big stretch even to get most teams to offer a stubby service to get programmatic access to their data and computations." The way services are offered over the Web, even Web 2.0 services, still has a big hangover from this mentality. I'd argue that most Web APIs are only marginally better than SOAPy stubs, largely because XML and JSON aren't particularly Web-friendly. Ok, don't bite my head off, let me qualify that.

First, XML. There have been plenty of arguments over the years around XHTML, and back in the day (I wonder how old that phrase is) there were arguments about the XML nature of RSS. Postel's Law, the "Robustness Principle", got cited a lot. Let me give you some deja vu:

Be liberal in what you accept, and conservative in what you send.

What a lot of people misinterpreted was the keyword robust. A robust system is one designed to fail gracefully or continue working acceptably with noisy data. That's exactly what we want for the Web, right? Well, not necessarily: if I were ordering a book from Amazon and there was a partial failure, I'd rather they didn't make a best guess when it came to taking money off my credit card (I think I'm paraphrasing Tim Bray there). Anyhow, XML is not robust, by design. XML is designed to bail out completely at the first sniff of anything dodgy. As it happens, the way XML is often served on the Web is without proper regard for the media type, i.e. dodgy and hence broken.

Sorry, that was a gratuitous deviation. The real reason I'd say XML isn't Web-friendly, like JSON, is in the way people use it. Whether data is conveyed as name-value pairs or through more complex structures, the key parts are generally just simple strings. But by itself, a string on the Web is next to useless. You or I can (maybe) read it, or even paste it into Google and get a definition. But what is a poor machine client to do? What makes the Web is links. It's 101 but somehow still manages to be overlooked: a link has two facets, a universally unambiguous name (URI/IRI) and a protocol for following it (HTTP). If a client on the Web encounters a link, it can follow its nose to find out more information about it. That's what we as humans do in browsers all the time, yet when it comes to Web services, for some reason a simple string is seen as adequate to identify something.
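To make the contrast concrete, here's a toy sketch (hypothetical payloads, not from any real API) of what a machine client can and can't do with the two styles:

```javascript
// Sketch: a client can only "follow its nose" when a value is a URI.
// Given a parsed payload, pick out the values it could dereference.
function followableValues(obj) {
  return Object.keys(obj)
    .map(function (k) { return obj[k]; })
    .filter(function (v) {
      return typeof v === 'string' && /^https?:\/\//.test(v);
    });
}

// A plain-string payload gives the client nothing to follow:
var opaque = { material: 'wood', type: 'table' };

// The linked version lets it fetch each value for more information:
var linked = {
  material: 'http://dbpedia.org/resource/Wood',
  type: 'http://dbpedia.org/resource/Table'
};
```

With `opaque` the client gets an empty list; with `linked` it gets two URIs it can go and dereference.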

Ok, with XML, the HTML DOM and to some extent JSON there's been some justifiable resistance to the use of URIs for names, because namespaces have traditionally been unintuitive at best and agony at worst. Using URIs instead of simple strings certainly adds a burden (it doesn't have to be that great, check Turtle syntax), but its benefits far outweigh the costs.
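A quick sketch of how small that burden can be: with a Turtle-style prefix map, compact names expand to full URIs in a couple of lines (the prefix URIs below are the standard rdfs and foaf namespaces):

```javascript
// Sketch: Turtle-style prefixes keep the cost of URI names low. A client
// only needs a small prefix map to turn compact names into full,
// unambiguous URIs.
var prefixes = {
  rdfs: 'http://www.w3.org/2000/01/rdf-schema#',
  foaf: 'http://xmlns.com/foaf/0.1/'
};

function expand(name, prefixes) {
  var parts = name.split(':');          // 'foaf:name' -> ['foaf', 'name']
  var base = prefixes[parts[0]];
  return base ? base + parts[1] : name; // unknown prefix: leave as-is
}
```

So `expand('foaf:name', prefixes)` yields the full FOAF URI, while anything without a known prefix passes through untouched.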

The thing is, you'll hear talk of snowflake APIs - only one implementation of each exists - but what gets overlooked is that by their very nature, most APIs just aren't Webby. The client must have prior knowledge that the service at endpoint X uses API Y. What you end up with is effectively a series of 1:1 client-server connections. REST's uniform interface may make these less brittle than RPC connections, but they still mean tight coupling.

Ok, you might argue that for any communication to take place, some prior knowledge is required. Sure, but that can be minimised: just like the way we follow links for more information in a browser, a service client can follow links to get more information. This is only a small conceptual step, but what it enables is hugely powerful. Above everything else, it's what Linked Data and the Semantic Web get right.

I reckon that browser developers, with their emphasis on doc-oriented HTML have a natural tendency to carry their experience in that domain across and apply it to data. Naturally namespace-less XML and JSON will seem preferable through that lens. But in practice, documents and data are apples and oranges. Browsers have been optimized over the years for the former, incidentally making the latter harder than necessary.

It's funny how you don't hear so much about service mashups these days, despite their undeniable coolness. I'll assert that it's because developing for Web data in the browser is bloody hard work, especially when there are NxN arbitrary API mappings to know.

Overall it's actually something of a miracle that the notion of cloud-based platforms has emerged.

I had planned to say more about Cloud Computing Outside of the Browser - or to put it another way, evolving old-fashioned non-browser Rich Internet Clients (as well as server-server and every other non-browser configuration). But ranting's worn me out. Anyhow, in short, I reckon that for the foreseeable future, non-browser clients in many circumstances are probably preferable to browser-based equivalents, primarily because they're easier to develop (as I keep saying, I reckon the agent model of combined client/server units is a good way to go). While I personally welcome HTML5 and the APIs as a clean-up of document markup and processing, when it comes to data it isn't even a Band-Aid.


danja
2012-01-09T20:00:25+01:00
apis cloud browser services rdf

Dart H. Vader

I just heard about Dart (via Seth Ladd and Edd), a new Web programming language from Google. It aims to fulfil the role Javascript currently has, only doing it better. On the pro side, new languages are inherently cool, and Javascript can be a real pain. On the con side, it seems unlikely that any browsers other than Chrome will support it in the foreseeable future, except potentially via translation to Javascript, i.e. This Page Best Viewed with Chrome.

It's hard not to see echoes of the old Microsoft arrogantly pushing its own product here (remember VBScript?), although Google have in recent years made NIH an art form. But who cares about politics; how's this going to affect the Web?

Well, Code-on-Demand does appear in Fielding's thesis (slightly bizarrely, as an 'optional constraint') and has been around since the early days. Pluggable clients are certainly a good idea, and Google have been leaders in moving Rich Internet Applications from opaque desktop apps into the browser using Javascript. The apps are still pretty opaque (View Source on Gmail if you doubt that) but they do at least more-or-less run cross-browser.

I've not read much of the Dart docs yet, not tried it at all, but first impressions are that it has a nice clean syntax not unlike JS (or for that matter Java, C# or Python...) and they've already got a good bunch of libs together (even if they do include RPC, yuck!).

As an aside, it should be noted that there's a cost to the standardization of today's browser as Web client (in the process of being defined via HTML5 and associated APIs). It does mean an effective monoculture of HTTP clients. Arguably you can write whatever kind of client you like (probably in Javascript) and host it inside a browser, but browsers have been optimized for a fairly specific app scope. If you stray from the general model of a Web of HTML Documents you're in for an uphill journey. The arbitrary desktop client has more freedom to use HTTP more creatively, but then there won't be one on everyone's desktop. (Personally I like the notion of Web agents (where an agent = client + server + persistence + code) as an abstraction for Web components, as in "Two Webs!" [pdf - heh]. I wonder, is there an HTTP server in Dart yet?)

Looking at the "Leaked internal Dart email" (as with UK politics, it's probably sensible to take the "Leaked" aspect with a pinch of salt), there does seem to be some motivation for Dart coming in response to the success of iOS. I'm pretty sure a new language isn't the best response to this, but it certainly makes a change from the usual big proprietary Flash/Silverlight kind of issues. Google are still talking of evolving Javascript, but it does raise the question of what Dart will offer that couldn't be achieved using JS. Optional typing is the feature they seem to be plugging most, so I wondered if anyone had worked on adding static types to JS. Funnily enough, the first few hits refer to iOS. Oh dear, we're really not talking iOS envy, are we?

It's a little surprising that Google haven't thrown their expertise at the JS-is-a-mess issue previously; I don't see a groundbreaking dev tool and pattern library out there. (Funnily enough the Dart Editor is based on Eclipse, which does seem a bit un-groundbreaking, although I'm not criticising the choice; Eclipse is my main IDE.)

Whatever, it should be interesting to watch how this pans out. Dart will almost certainly be a very cool language, albeit engendering ambivalence everywhere outside Google. Give me a shout when it includes libs for non-HTML Web languages (i.e. gimmee RDF :)

Comments (G+)


danja
2012-01-06T20:48:18+01:00
google language programming dart rdf

Listy Thing - note to self

Spent this morning having another go at sorting out my lists and links. The aim is to keep them in a triplestore (probably Seki/Fuseki/TDB/Jena) and to be able to add, organise & edit them in a browser. I'd better leave this, have a nap then get on with something else now. So to help me remember where I'm at:

  • Rearrange & in-place edit (with jQuery) worked on test page, doesn't yet work on real data (crashes browser!)
  • Editory thing - four-pane CSS seems ok, CKEditor looks good for rich content, need to play (not tried with above yet, not sure about cross-list D&D)
  • Did a dump of links from Chrome, ran through Tidy, XSLT (xsltproc) to ul/li and split into separate lists - basically working ok

tidy-default bookmarks_1_6_12.html > bookmarks.xml

xsltproc lib/bookmarks-split2lists.xsl bookmarks.xml

for file in *; do mv "$file" "${file}.html"; done

  • Vocab - no idea for textual list items; Annotea has http://www.w3.org/2002/01/bookmark#Bookmark, also the Tag Ontology

Everything is in github under hyperdata/lists.

PS. got some dump-from-del.icio.us code tagliatelle. Need to try AndyS's rdf:List with SPARQL 1.1 Update stuff.


danja
2012-01-06T15:15:08+01:00
lists bookmarks links rdf

Scutter's Mate

As I was admiring the Linked Open Vocabularies Endpoint (LOV-E) it occurred to me that the vocabs I maintain (well, create and forget...) aren't particularly discoverable. Even before saying they're vocabs, there's not necessarily anything linking in to them (yes, really forget). Ideally I suppose I should put together a proper Semantic Sitemap, but for now I've thrown together a quick and dirty directory walking script in Python: scutters-mate.py. It produces a Turtle listing of the RDF files it finds (by filename extension) containing entries like this:

<http://hyperdata.org/xmlns/meta.ttl> rdfs:seeAlso <dogmood/index.ttl> .
<dogmood/index.ttl> rdfs:seeAlso <http://hyperdata.org/xmlns/meta.ttl> .
<dogmood/index.ttl> format:format <http://purl.org/stuff/formats/text/turtle> ;
    rdfs:label "text/turtle" .

Here I ran it in the /xmlns directory and saved the output to xmlns/meta.ttl.
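For anyone curious about the shape of the logic, here's a sketch of the entry emitter - in Javascript rather than the Python of scutters-mate.py, and simplified; the extension-to-media-type map is my own guess at what the script recognises:

```javascript
// Sketch of the listing logic: given a filename found by a directory
// walk, emit a Turtle entry like the ones above. Illustrative only -
// the real scutters-mate.py is Python and may map extensions differently.
var FORMATS = {
  '.ttl': 'text/turtle',
  '.rdf': 'application/rdf+xml',
  '.nt': 'text/plain'
};

function turtleEntry(metaUri, file) {
  var ext = file.slice(file.lastIndexOf('.'));
  var mediaType = FORMATS[ext];
  if (!mediaType) return '';            // not a recognised RDF extension
  return [
    '<' + metaUri + '> rdfs:seeAlso <' + file + '> .',
    '<' + file + '> rdfs:seeAlso <' + metaUri + '> .',
    '<' + file + '> format:format <http://purl.org/stuff/formats/' + mediaType + '> ;',
    '    rdfs:label "' + mediaType + '" .'
  ].join('\n');
}

var entry = turtleEntry('http://hyperdata.org/xmlns/meta.ttl', 'dogmood/index.ttl');
```

A directory walker just calls this for every file it finds and concatenates the non-empty results.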

I'm thinking I'll also run it from the root of all the domains I use, then try and remember to link to /meta.ttl wherever appropriate to give the scutters a helping hand.

Comments (G+)


danja
2012-01-04T20:16:46+01:00
sitemap scutter vocabs rdf data linked

Web Beep - where next...

Minor tweaks aside, I've got Web Beep to a good milestone: basically proof-of-concept.

Boxes ticked:

A good point at which to put it on one side and get on with some rather more pressing bill-paying stuff for a while.

But it'd be nice to have a clue on next steps. There are a few potential directions:

Ports

The obvious one is in-browser Javascript. While the HTML5 APIs look the best route long-term, it's not so obvious right now. There are things already around, like making .wav data: URIs, and also dynamicaudio.js - which looks very promising, it supplies a Flash player for browsers that don't support the API. Until very recently I expected there to be a need for DSP libraries (there is a dsp.js), but as it happens it only requires trivial stuff, and there's the Java to refer to, all easily hacked. (The only "serious" DSP bit is the Goertzel algorithm, but that itself is easy-peasy, already done: goertzel.js, literally only took a couple of minutes.)
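For the record, the Goertzel bit really is tiny. Here's an illustrative version (not goertzel.js itself): it measures the power of one target frequency in a block of samples, which is all a tone detector needs:

```javascript
// Sketch of the Goertzel algorithm: a single-frequency power measure,
// cheaper than a full FFT when you only care about a few known tones.
function goertzelPower(samples, targetFreq, sampleRate) {
  var coeff = 2 * Math.cos(2 * Math.PI * targetFreq / sampleRate);
  var s1 = 0, s2 = 0;
  for (var i = 0; i < samples.length; i++) {
    var s0 = samples[i] + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

// A 941 Hz tone (a DTMF row frequency) at an 8 kHz sample rate:
var tone = [];
for (var n = 0; n < 205; n++) {
  tone.push(Math.sin(2 * Math.PI * 941 * n / 8000));
}
// Power at the tone's frequency dwarfs power at an unrelated one.
var hit = goertzelPower(tone, 941, 8000);
var miss = goertzelPower(tone, 1336, 8000);
```

Run one detector per expected tone frequency over each incoming block and pick the winner.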

There might be uses for desktop UI-based codecs, but I don't know what... I might well hook something up to the current implementation, see if it inspires.

Some kind of mobile device app should have potential.

But all this is very tied to another dev direction -

Applications

What to do with the darn thing? danbri's put some good ideas down with ChirpChirp (that I've still not fully digested).

Nicholas J Humphrey had a brilliant suggestion: use them on radio - nearly every programme these days (BBC R4 at least) seems to read out one or more URIs.

I've not got a smartphone so am pretty clueless about that kind of app, but presumably there are a few around there.

Doing stuff with DSP and/or GA and/or RDF

Building the thing led to a couple of collateral proto-products: a little genetic algorithm-based optimizer and the makings of a DSP vocab/ontology.

There has been work done already around DSP and semweb tech by the dbtune and omras folks. The Henry service is a sweet example of the kind of thing that's possible; it's "...able to perform audio processing tasks to answer a particular query". The shape/scope of their ont does seem a bit different to what I've been finding, though obviously there's overlap. My inclination is to derive what's needed from the running code, then later align it with their material.

With a reusable system-description mechanism in place (i.e. a DSP vocab) it should be straightforward to apply the genetic algorithm optimization setup to any system which depends on a bunch of parameters and has a notion of fitness.
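As a sketch of that idea: if a system is just a parameter vector plus a fitness function, the optimizer loop needs very little machinery. This is a minimal mutation-and-selection illustration, not the actual Web Beep optimizer:

```javascript
// Sketch: a minimal genetic-algorithm-style optimizer for any system
// described as a parameter vector plus a fitness function (higher is
// better). Selection + mutation only; a fuller GA would add crossover.
function optimize(fitness, paramCount, generations, popSize) {
  // start with random parameter vectors in [0, 1)
  var pop = [];
  for (var i = 0; i < popSize; i++) {
    var p = [];
    for (var j = 0; j < paramCount; j++) p.push(Math.random());
    pop.push(p);
  }
  for (var g = 0; g < generations; g++) {
    pop.sort(function (a, b) { return fitness(b) - fitness(a); });
    // keep the fittest half, refill with mutated copies of survivors
    var survivors = pop.slice(0, popSize / 2);
    pop = survivors.slice();
    for (var k = 0; k < survivors.length; k++) {
      pop.push(survivors[k].map(function (x) {
        return x + (Math.random() - 0.5) * 0.1;  // small mutation
      }));
    }
  }
  pop.sort(function (a, b) { return fitness(b) - fitness(a); });
  return pop[0];
}

// Toy fitness: parameters should approach [0.5, 0.5].
var best = optimize(function (p) {
  return -((p[0] - 0.5) * (p[0] - 0.5) + (p[1] - 0.5) * (p[1] - 0.5));
}, 2, 50, 20);
```

Swap the toy fitness for "how well does this filter configuration decode the test signal" and the same loop tunes a DSP chain.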

I've also got a few other personal tie-ins with this - the opportunity to tie the DSP (and analog SP) bits to the SPICE in RDF stuff I was playing around with last year, and going back somewhat further, updating the RPP vocab from over a decade ago (I'll get these things finished eventually...). From a suitable level of abstraction there looks to be interesting potential overlap with data processing too - check David Booth's RDF Data Pipelines for Semantic Data Federation.


danja
2012-01-03T14:21:37+01:00
ga pipelines genetic webbeep algorithm dsp web rdf beep

Web Beep

I've just gone live with a little fun service : Web Beep - enjoy!

Comments to G+ please


danja
2011-12-31T19:22:33+01:00
audio dsp web rdf beep

DIY Rehab

Long story short: I've spent the last couple of months doing rehab on myself. No New Year Resolutions per se, but I'm coming back into the land of the living.

Ok, for starters I do like to drink too much too much. As well as going for it for fun, booze has always been my fallback when stressed or fed up. Given that no matter what mood I'm in a few glasses of wine are likely to improve it, it's easy to see how I developed a dependency. Over the years it's caused no end of problems of every kind, but as they say, old habits die hard.

But this last year I started pretty well - first few months totally abstinent, then relatively moderate for a couple of months. I actually managed to go to a conference and have a few beers without making a complete tit of myself, losing valuables or missing my transport home. Bravo Danny.

Some time afterwards I realised I'd slipped back into having something to drink every day. Not good, especially since I was also spending an awful lot of time lying on the settee watching rubbish on TV. So I went abstinent again.

Problem was, even weeks later, I still found myself lying on the settee watching rubbish on TV. Didn't feel inspired to do anything, was letting work slip (despite an empty bank account). Not really getting any pleasure from anything. Finally I did break the abstinence, had loads to drink over a couple of days and at the time felt great, except immediately afterwards was back where I'd started, plus nursing a monster hangover.

Had to act. Proper rehab was an option - my mother was due to be coming over for another long visit (she's here now) so the animals would be ok if I went away. But I was already able to be abstinent, no problem with that in itself. I just needed to clear what would be clinically diagnosed as depression. Not a dramatic woe-is-me hell-pit, just a general absence of enthusiasm for anything. (Once upon a time some nasty scratches on my arm were mistaken for self-harm, whereas in reality they were the result of a cat reacting to a visit to the vet...). Years back Caroline suspected I was a bit bipolar because I did sometimes get manic, but that would always be after an extended period of serious boozing. Most people would probably get the same way under the circumstances. Without chemicals I'm pretty level over time, even if that level might not be where I'd like.

So I decided to try rehab at home. Staying sober: taken as read. Removing temptation: not a problem. Removing things likely to cause problems, stress and so on: hmm. The only way I could bring myself to do that was dropping everything, irrespective of commitments. So I did. I simply stopped looking at email (and the online social nets). But as I was already screwing up work, it seemed to make sense to choose the lesser evil long term. Then there's the other bit: therapy.

I've talked to a few therapists over the years from a variety of schools of thought, not personally found any of much use. Similarly I've tried various medications, again, not worth the effort. But to be able to get back into doing productive things, I definitely needed some kind of what might loosely be called occupational therapy.

The answer seemed to be to spend time doing things that I've enjoyed in the past. Basically playing, but it's a little different when play has become a chore.

So that's what I've been doing the past couple of months. Playing. It seems to have worked, got me out of the immediate pit at least. I did have a brief go at a bit of fresh work-work that looked less demanding than what I usually go for, but backed off soon after starting. Fingers crossed that will still be available now I'm about ready to crack on with things.

Without planning it that way, my play activities have pretty much corresponded to things I should be doing. I've been busy with a fun dev project [that just went live - more on that shortly], but it's a bit removed from the Web of Data stuff. That's involved research, but again a different domain than usual. I've been writing, but not tech stuff, rather tidying up old write-ups of various hobby projects, adding some little bits [some done, but it's ongoing]. I've been doing a fair bit of woodwork, but rather than the long-overdue bits of house restoration it's been making Christmas presents (the fact that I'm too skint to buy anything has been good motivation there) [some still to finish, heh]. Rather than spending time on social nets to engage with the rest of the world I've been working on a little music video project [mostly done, will release in a couple of weeks].

I've not been doing any fresh music to speak of, not least because I was getting frustrated at not managing to produce stuff I liked. I think maybe I was pressurising myself there a bit too - even though it's meant to be fun stuff, I was treating it like I should be doing it. I'm pretty sure I'll be able to get back into it. (Incidentally I've been listening to a mix of Robyn and Led Zep recently, go figure).

Check back here in a year's time to see how well the DIY Rehab strategy has worked. In the meantime (a few hours early) -

Happy New Year!


danja
2011-12-31T14:14:06+01:00
rehab personal

Tooling About

I'm a bit of a tool fetishist, both with software and real-world stuff. As well as accumulating them, for various reasons I like making and using my own tools. The most valid reason is that it's usually fun, though some of the other reasons are rather spurious. It often seems the time spent learning to use someone else's piece of software effectively could be better spent writing something from scratch. In itself that usually turns out to be a mistake. Likewise, most of the time making a real-world tool turns out to be an inefficient approach, given the time and money that can go into it compared to simply buying a good quality mass produced version. Sometimes the extra value that comes from having a result that's more exactly fit for purpose than the alternatives is good justification for DIY tooling. A more common benefit comes simply from creating something that's more personalised than the off-the-shelf version and because of that more pleasing to use.

Making things is satisfying, making things that you use to make other things doubly so. Within reason. I suspect I may be crossing the line with my latest little hobby project.

Very low down on my list of priorities, I need some new business cards. Ok, half an hour's design, tops. But I've got some pieces of (European) boxwood I collected a few years ago, nicely seasoned now, and for a while I've been thinking about things to make with them. Boxwood is lovely stuff, very hard, tight grain, nice appearance. One of its traditional uses is block printing (it's what Bewick used for his animal prints).

Right, so I'll block print some business cards. That means carving the letters, and I haven't really tried lettering before. First big question, which font? The obvious answer is of course to design a new one, optimised for woodcarving. A big factor there is taking into account the shape of gouges, aiming ideally for a minimum number of tools and a minimum number of cuts. FontForge seems a pretty good piece of kit and you can use images as guides for the typefaces. This gives me an iterative process: print out a typeface I like, try carving it with a limited set of tools, note what works and what doesn't. Photograph the results, use them as a template for a revision on the computer.

Earlier today I started on this, and almost immediately hit a snag. While I can more or less get away with the larger gouges I've got for the design part of this, I don't really have the shapes among the smaller ones, the ones I'd need for business card proportions. This is despite having about 85 gouges/chisels/knives at hand (I got them all out on the bench earlier, couldn't resist counting). This figure might sound like evidence of serious obsession when, say, 5 decent gouges/chisels is plenty for most projects, but traditional woodcarvers would often have a lot more (Grinling Gibbons is reckoned to have had about 300). They all come in handy somewhere and just turn up over the years - junk shops, inheritance, even the occasional new purchase.

Coincidentally, the other day I was reading about using hook tools for woodturning (Robin Wood, author of the remarkable The Wooden Bowl, uses them on a pole lathe) and it turns out you can make them yourself using a blowtorch, masonry nails and a big hammer. I'd always assumed you needed a forge or oxy-acetylene to work hard steel. Aha! Earlier today I tried the technique and made a little chisel (a woodcarving chisel, slightly different from a regular woodworking chisel). It's a bit wonky but perfectly usable. Took maybe an hour, including making the handle (boxwood). So while I play with the typeface design with the aid of the existing larger chisels, I plan to make some smaller ones for business card purposes.

Oh yeah, and over the years here, whenever I've seen an oak apple I've taken it home, with the vague idea that one day maybe I'll have a go at making some ink. Then of course I'll need to make a press for the paper...these business cards may take a while.


danja
2011-11-19T01:10:12+01:00
chisel wood printing tools forge woodcarving

A very compact database query language based on binary relations

Kragen thought it through a bit: http://canonical.org/~kragen/binary-relations.html

(God I love people that are cleverer than me)


danja
2011-10-19T17:08:50+01:00
rdf