Saturday, October 31, 2009

XML for Dummies

XML


XML: eXtensible Markup Language. It's a different language. Different from HTML, HyperText Markup Language. HTML focuses on describing how a Web browser should display images and text on the screen. It merely serves as a way to format the information on the page so that it is presentable to users. The browser attaches no meaning to what the elements (images and text) represent. Users infer the meaning from the context of the elements. If numbers appear on the screen in a date-like format, users recognize them as a date. To the browser, they are just numbers separated by slashes.

XML is often pitched as a means to replace HTML. It is a language that is designed to be structured, with meaning, while allowing for data exchange. How would XML replace HTML? There are 2 file types. First, there is the XML file. This file defines the elements within the page. There is so much information out there in the interwebs, how do we define everything? An XML document can be modeled as a tree of elements through the Document Object Model (DOM). In conjunction with a Document Type Definition (DTD), there is now a data model and a corresponding definition for each data element. Agreed-upon tags are used to separate the different elements and data. By looking at an XML file, one could determine the meaning of the document without having to look at the website and work out its meaning from the context in which it is placed. Elegant, isn't it?! So by using a standard set by organizations with a common interest, it would be possible to share data between them, all through the content on the web site.
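To make the "meaning in the tags" idea concrete, here is a minimal sketch in Python using the standard library's ElementTree parser. The invoice document and its tag names (invoice, customer, date, total) are invented for illustration; they do not come from any real standard or DTD.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document. The tag names are made up for this example.
INVOICE_XML = """
<invoice>
  <customer>Acme Corp</customer>
  <date>10/31/2009</date>
  <total currency="USD">149.95</total>
</invoice>
"""


def read_invoice(xml_text):
    """Return the invoice fields as a dictionary keyed by their meaning."""
    root = ET.fromstring(xml_text.strip())
    total = root.find("total")
    return {
        "customer": root.findtext("customer"),
        "date": root.findtext("date"),  # labeled as a date, not just digits
        "total": float(total.text),
        "currency": total.get("currency"),
    }


invoice = read_invoice(INVOICE_XML)
print(invoice["date"])
```

The program never has to guess that 10/31/2009 is a date; the tag says so.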

Because XML focuses on the meaning of data and elements, it does not touch upon the visual aspect of how such information is displayed on a web page. To remedy this, one must use a stylesheet. The Yang to XML's Yin is the eXtensible Stylesheet Language (XSL). The XSL document defines the rules for displaying an XML document's data in the web browser. With the power of the two working together, one is able to replace a merely visual interpretation of a web page with a more meaningful and data rich document.
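The split between data and presentation can be sketched in a few lines of Python. To be clear, this is not XSL: it is plain Python standing in for the job a stylesheet would do, and the event document and the render_as_html helper are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_as_html(xml_text):
    """Turn each child element of the XML root into an HTML list item."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for child in root:
        label = escape(child.tag.capitalize())
        value = escape(child.text or "")
        rows.append(f"  <li><b>{label}:</b> {value}</li>")
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


# The data says what things are; the function above decides how they look.
EVENT_XML = "<event><name>Halloween Party</name><date>10/31/2009</date></event>"
html_output = render_as_html(EVENT_XML)
print(html_output)
```

Swapping in a different rendering function changes the look of the page without touching the data, which is the same division of labor XML and XSL aim for.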

Findability...yes, it's a real word.

SEO


Everyone wants their 15 minutes of fame. For the online world, that fame is equivalent to being listed first in search engine results. Studies have shown that users view results pages from left to right, top to bottom. Seeing a site listed first usually signals to the user, "Hey there Mr User, I am exactly what you're looking for. Your key words exactly match my keywords. Click on me and you'll find what you're looking for!"

The better way to be found on the interwebs is to be listed in the major search engines. The best way to be found on the interwebs is to be listed first in the major search engines. It is of great importance to be listed first in a search result. You have the honor of being the most relevant site and having the highest chance that the user will click the link. How many pages do you personally scroll through before you change the wording in your search to find what you're looking for? If I don't find what I'm looking for in the first couple of pages, I rephrase my search, hoping for better results. Here is where it gets tricky for websites, in terms of content management, to produce better site exposure and findability.

In contrast to Search Engine Marketing, Search Engine Optimization uses natural or organic ways of editing and managing onsite content to increase the rankings in search engine results. This is important because 70% of traffic comes from organic search results, in contrast to the 30% for paid ads in SEM. So how do we optimize a site so that it appears as a relevant site for search queries? We have to first look at how search engines parse the information on a site, what weight is given to what elements, and how the totality of those elements affects the standings.

Search engines have become complicated affairs. They weigh a wide range of factors and algorithms when evaluating a site. And of course, these algorithms are trade secrets. You wouldn't want the world to know how you make your search engine go VROOM VROOM, would you? Otherwise other engines would produce the same results, and web sites would edit their pages to make sure they're listed as #1.

On-page factors, the content that we can control, play the major role in creating a level of visibility in search engine queries. Here, we delve deeper into the architecture of the webpage. The title tag, body, and header tags all play roles in creating what is called keyword density. Even the domain name should have something relevant to the keywords, if not the keywords themselves. If there is any usage of images, make sure descriptions for the images also relate to the content and keywords. Another trick is to have the keyword as part of the filename itself. Links also play a role; these include links from your site to other relevant sites as well as outside links pointing back to your own site.

By doing some research to determine what keywords generate the most search queries relevant to your site, you can then optimize the content on your site. But because each search engine runs a different algorithm, it will be hard to find one format that will flourish in every search engine. This will be a game of cat and mouse, determining what works for each different engine. You must also take into account that changes made to your site might not be acknowledged by the search engine right away. There may be some lag before you see a change in the volume of traffic, or lack thereof.
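Keyword density can be measured with a short script. The sketch below assumes the common definition (occurrences of the keyword divided by the total number of words); search engines do not publish their formulas, so treat this as an illustration rather than how any engine scores a page. The sample page text is made up.

```python
import re


def keyword_density(text, keyword):
    """Share of words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# Hypothetical page parts: title tag, header tag, and body text.
page = {
    "title": "Fresh Coffee Beans Online",
    "h1": "Buy Coffee Beans",
    "body": "We roast coffee every morning and ship the same day.",
}

densities = {part: keyword_density(text, "coffee") for part, text in page.items()}
for part, density in densities.items():
    print(f"{part}: {density:.0%}")
```

Running it on your own title, header, and body text gives a rough picture of where the keyword actually shows up and where it is missing.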

Sunday, October 25, 2009

It's All About The Semantics

Semantic Web


It can be a tortuous journey crawling through the endless interwebs of information in cyberspace. This is because the internet and the World Wide Web are fairly young, still in their infancy. The search engines we use can sometimes struggle with our search queries because they don't quite understand our Homo sapiens thought process. They cannot tell the difference between Paris Hilton and the Hilton in Paris. This is because they base their searches on keywords and not the meaning behind the words. What if search engines and the interwebs were smarter? What if they understood the natural language that we use? Curtains open, music plays, and drum roll: let me introduce to you the notion of Web 3.0, aka the Semantic Web.

By using a set of standards, the World Wide Web would become one big database. But this is a large undertaking because the entire World Wide Web would need to be re-annotated. The metadata on each page would need to be updated following the specifications of RDF, OWL, and XML, to name a few. The Semantic Web is about using a common format so that data can be combined as well as integrated. It is also about how data correlates to real world objects. The goal is to create a more intelligent Web. One that is as understandable to computers as it is to humans. One that would not only look at key words, but take into consideration the relevance and context of other words associated with the key words. One that could serve as a personal assistant of sorts.

Today, if you wanted to grab dinner and a movie, you would search for your movie options by location, time, and theatre company. Then you would search for the various restaurants in your area that may be of interest and maybe sort them by customer ratings. Only after you visited a number of sites would you be able to make your decision. In a Web 3.0 solution, you would be able to ask for a dinner and movie recommendation in your vicinity. The search engine would analyze your query, search for all possible answers, and then organize the results for you. It would be as if you asked 'someone' else to do it for you and then come back with the answers. It would be able to take complex queries, reason about them, and return an ordered set of results.

Recently in the headlines is a government site, Recovery.gov. The Obama administration is using these next generation web technologies to let the general public view how the economic stimulus of about $800 billion is being spent. A quick overview blurb of the project can be seen here.
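The Paris Hilton problem gives a feel for what "data with meaning" buys you. RDF describes facts as subject, predicate, object triples; the toy sketch below mimics that shape with plain Python tuples. It uses no real RDF library, and every identifier in it (ParisHilton, HiltonParis, locatedIn, the match helper) is made up for illustration.

```python
# Each fact is a (subject, predicate, object) triple, the basic shape RDF uses.
# All identifiers below are hypothetical.
TRIPLES = [
    ("ParisHilton", "type", "Person"),
    ("HiltonParis", "type", "Hotel"),
    ("HiltonParis", "locatedIn", "Paris"),
    ("Paris", "type", "City"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the given fields; None is a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "The Hilton in Paris": something that is a hotel AND is located in Paris.
hotels = {s for (s, _, _) in match(TRIPLES, predicate="type", obj="Hotel")}
in_paris = {s for (s, _, _) in match(TRIPLES, predicate="locatedIn", obj="Paris")}
hotels_in_paris = hotels & in_paris
print(hotels_in_paris)
```

A keyword search sees the words "Paris" and "Hilton" in both cases. The triples record that one is a person and the other is a hotel in a city, so the two can no longer be confused.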

Wednesday, October 21, 2009

Show The World

YouTube


Now that we have created our content, how do we deliver it to our viewers? You can decide to use a service to host your videos or utilize one of many websites on the interwebs that focus on user generated content.

If you decide to use a service or host this yourself, there are a couple of decisions you would have to make. There are 2 types of video delivery mechanisms: progressive download (web server) and streaming (streaming media server).

In the progressive download method, the media file is hosted on a typical web server. The web server treats it like any other file and sends the data out without regard to the contents inside. You cannot view any part of the content until that part has been downloaded. Once an initial amount of data has been loaded, you may watch that portion. You are not able to skip ahead until all the data up to that point has been loaded. Sometimes, you may have to wait until the entire file is downloaded before you can start watching. The interaction is strictly one way. The file is saved on the computer temporarily so that it can be viewed again without having to download it a second time.

In the streaming method, the media file is hosted on a streaming media server. The media server has specialized software that knows about the bandwidth, format, structure and the performance of the player receiving the media file. As such, the server streams the data at an optimum rate. Because there is 2 way communication between the server and media player, there is more control over the stream itself. Users are able to skip around to different points in the media file and the server will send the appropriate data. The video file is not stored on the computer as it is with the progressive download method.

An analogy is to view the different methods as different ways of drinking juice. In progressive download, you cannot drink the juice until it has been poured into the glass. You have to wait for it to be poured. With streaming, you may drink directly from the bottle, at any time you wish.
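The difference in seeking behavior can be captured in a toy model. The sketch below is a simplification under stated assumptions: it ignores buffering, bitrates, and real network protocols, and the class names (ProgressiveDownload, StreamingSession) are hypothetical, not part of any media library.

```python
class ProgressiveDownload:
    """Toy model: the file arrives front to back from an ordinary web server."""

    def __init__(self, duration_s):
        self.duration_s = duration_s
        self.downloaded_s = 0  # seconds of video that have arrived so far

    def receive(self, seconds):
        """Simulate more of the file arriving."""
        self.downloaded_s = min(self.duration_s, self.downloaded_s + seconds)

    def can_play_at(self, position_s):
        # Only the portion that has already arrived is watchable.
        return 0 <= position_s < self.downloaded_s


class StreamingSession:
    """Toy model: the player asks the media server for any point in the file."""

    def __init__(self, duration_s):
        self.duration_s = duration_s

    def can_play_at(self, position_s):
        # The server sends data starting from wherever the player requests.
        return 0 <= position_s < self.duration_s


progressive = ProgressiveDownload(duration_s=600)
progressive.receive(60)  # one minute of a ten minute clip has arrived
stream = StreamingSession(duration_s=600)

print(progressive.can_play_at(300))  # skipping to the 5 minute mark
print(stream.can_play_at(300))
```

With only the first minute downloaded, the progressive viewer cannot jump to the five minute mark, while the streaming viewer can.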

To sum everything up, the main reason for downloading from a web server is that it uses existing infrastructure and is fairly simple. It's good for short, high-bitrate video content, and it also allows users to save the file. Streaming is better when you have longer clips and want to allow a bit of user control, or when you are streaming live webcasts.

But that would be the hard way. You can head right over to any of the dozens of web sites that host user generated content for free. YouTube is by far the most dominant player, commanding a 43% market share and ranking as the fourth most visited site on the internet. It is a great place to get exposure and take advantage of the large user base and audience that it provides.

YouTube allows videos that are up to 10 minutes in length with a file size of up to 2GB. It accepts the most common formats, including .AVI, .WMV, .MOV, .MKV, MPEG, .MP4, DivX, .OGG, and .FLV. YouTube offers 3 different playback formats: "high quality", HD (high definition), and a mobile format. One of the best features of YouTube is the fact that the videos are viewable outside of the website. Each YouTube video is paired with a snippet of HTML code that allows it to be embedded on web pages outside of YouTube. Talk about portability! Time to get crackin'!
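Before uploading, it can save time to check a video against those limits. The sketch below is a hypothetical pre-flight check written for this post; it is not a YouTube API. It assumes 2GB means 2 x 1024^3 bytes, and it only checks the dotted file extensions listed above (MPEG and DivX are left out because they are listed as formats rather than extensions).

```python
import os

# Limits as described in this post; the byte interpretation of 2GB is an assumption.
MAX_DURATION_S = 10 * 60
MAX_SIZE_BYTES = 2 * 1024**3
ACCEPTED_EXTENSIONS = {".avi", ".wmv", ".mov", ".mkv", ".mp4", ".ogg", ".flv"}


def upload_problems(filename, duration_s, size_bytes):
    """Return a list of reasons the video would be rejected (empty if none)."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if duration_s > MAX_DURATION_S:
        problems.append("longer than 10 minutes")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("larger than 2GB")
    return problems


print(upload_problems("demo.MP4", duration_s=300, size_bytes=50_000_000))
print(upload_problems("talk.mov", duration_s=900, size_bytes=50_000_000))
```

An empty list means the clip fits within the limits; anything else tells you what to trim or convert first.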

To read about capturing video...visit Chris' blog

To read about video compression...visit Keith's blog

Thursday, October 15, 2009

Web Analytics

Thursday, October 1, 2009

Google Docs...The Doc of All Trades?


Google has been pushing its cloud computing "software as a service" for some time now with a software suite called Google Docs. What would compel someone to use it? The cost. It is a free application. It does not get any cheaper than that. There are many other advantages to Google Docs as well. The documents are stored on Google's servers, and they accept most formats, including Microsoft's Word, Excel, and PowerPoint files. You can invite "collaborators" who can view and edit your document in real time, and the changes are reflected immediately. This is a great feature and time saver; our database management team used Google Docs to collaborate on our project from different locations. Having the documents stored on their servers means you have access as long as you are connected to the interwebs (and assuming their network is up and running). I must say the interwebs are most interesting!
http://www.google.com/google-d-s/tour1.html

Another major competitor is Zoho. Zoho also offers a set of online applications that are accessed from their website. They, too, cover the gamut of Microsoft's offerings. Do we see a common denominator here? Zoho is also free, though they do have paid services for higher tiered applications for organizations. Zoho seems to have a more extensive collection of applications, including web conferencing, a database application, and online invoicing, to name a few.

The common denominator here is Microsoft and its software suite. I think it is safe to say Microsoft has a strong foothold in the word processing, spreadsheet, and presentation market. So when Google and Zoho came along and offered free and comparable products, people would naturally gravitate towards the new offerings. More users using Google Docs and Zoho meant fewer users using Microsoft Office. Now they can't let THAT happen, can they? Their answer: Microsoft Web Applications 2010. This has just entered testing and has not been released to the public yet. Rumor has it that this would be a free service. How could they charge when their competitors are just giving it away? What they will try to accomplish is to provide the same consistent look and feel that their users are already accustomed to across all channels; this includes the desktop software, the online web application, and mobile sites on smartphones. Let me state this: Google Docs and Zoho can both be viewed through the mobile web. I tried Google Docs on my iPhone and I don't understand why anyone would want to, unless their phone was the only available connection to the web they had. Editing a spreadsheet is a mind numbingly cumbersome activity.

So it seems that all 3 competitors are on a level playing field. Google has the brand recognition behind it, Zoho has an extensive software suite, and Microsoft has a user base that spans corporations and countries. I believe it will come down to ease of use. Simple as that. How easy can they make it so we can get what we need done quicker and easier? A simple and intuitive interface with a quick learning curve. I will hold my judgement until Microsoft's Web Applications goes live.