
Commentary on news and trends in the fields of messaging, content control, archiving, compliance, e-discovery, and data leak prevention.



Exchange 2010 includes archiving and e-discovery; these need powerful indexing capabilities.

Third-party on-premise archiving vendors struggle constantly with indexing technologies. Indices get corrupt and take days or weeks to regenerate. Searches return results that aren’t as expected, or aren’t understood. Indexing technologies age, and when a vendor replaces them, your corporate memory looks very different.

To understand the challenges, think of Outlook:

  • It’s often hard to find email in PSTs.
  • You frequently don’t get what you’re looking for.
  • You get “indexing is not complete” messages.

Now consider e-discovery on a corporate scale. Searches become critically important. For example, you may need to defend your CEO against accusations that might land him in jail; your CEO is certain that an email is there, but the search tool can’t locate it. In the meantime, you have five days to find the email, and the clock is ticking.

Exchange 2010 includes a new discovery module that searches primary and archive mailboxes, and works across multiple mailboxes. It is built by one of the smartest teams in the Exchange product group. However, it’s unclear whether or not the search will be good enough. If it’s like our experiences with Outlook search, the answer is no.

We think Exchange search should be a lot better than that of Outlook. Nevertheless, the challenges are substantial, and there is a good possibility that it won’t be up to the job. For example:

  • Important file types may not be supported.
  • Documentation may be unclear on how to adjust the index, and when adjustments need to take place.
  • Users may not understand the results of a search.
  • There may be problems with non-English searches.
  • Wildcards and stemming support may be limited.

We would welcome input from readers on their practical experience with Exchange 2010 searches in the stressed and demanding environment of e-discovery.

David Sengupta

The single most important technology change of our time is happening around the cloud. Entire industries are being changed or made obsolete. The world as we know it is changing fast.

Examples of what has changed:

  • Paper-based maps and manual navigation have been replaced with cloud-based maps and GPS systems.
  • Photography has shifted from film-based to digital, with processing and often photo albums primarily cloud-based.
  • Books are moving from paper-based to electronic paper or digital, with cloud-based ordering and digital library functions.
  • Email systems are transitioning from on-premises to cloud-based solutions.
  • Print media and advertising have largely gone online.
  • Customer relationship management (CRM) systems have moved online.

And examples of what is likely:

  • Enterprise telephony will shift from on-premises PBX to cloud-based VoIP solutions.
  • Home-based landlines will be replaced by mobile phones.
  • Backup and recovery systems will move to the cloud.
  • Systems management will move to the cloud.
  • Archiving and compliance will move to the cloud.
  • E-discovery solutions will move to the cloud.
  • Systems with heavy processing requirements will shift to elastic compute technologies.
  • File storage will move from local computer-based or external hard drive-based to primarily cloud-based.
  • What remains of fax will disappear.
  • Human-operator-based conference call bridges will go away.
  • Translation services will be mainly automated and go online.
  • Paper-based billing will be completely replaced by online.

In each case, the primary technologies and delivery mechanisms will move to the cloud. On-premises solutions will become the exception, not the norm.

As you navigate town with your GPS, BlackBerry by your side, with your new Kindle reading your ebooks to you via text-to-speech as you drive, we suggest you reflect on this. The shift towards the cloud is compelling, and in some ways, irresistible.

David Sengupta

We’re hearing a series of rumors that something is going badly wrong at Dell/MessageOne. For example:

  • They’ve lost a huge amount of customer archived email over the past couple of weeks
  • Many customers are making inquiries about other vendors and their ability to ingest/absorb their historic archive data
  • One vendor told us it had been asked to help customers move their email back from Dell/MessageOne, and to advise on the most efficient way to ingest large amounts of data (10 TB, for example)

Can anyone help further? If you have input, please send it to me at david.ferris@ferris.com or call me on +1 415 367 3436. If you prefer to be anonymous, let me know and we won’t publish anything that identifies you.

David Ferris

There are three big players in the professional networking business: LinkedIn, Viadeo, and Xing. LinkedIn is best known in the United States. But France’s Viadeo and Germany’s Xing are important competitors. Viadeo recently paid us a visit. The company’s focus is on Europe and emerging economies. It has 25 million subscribers. It plans to enter the U.S. market through leveraging its international subscribers, such as those in Canada and Mexico.

Main services from the user standpoint:

  • User directory.
  • Users can belong to many different interest groups.
  • Each interest group provides:
    • Common bulletin board/wall
    • Directory of interesting events
    • Q&A/forums for advice sharing
    • User-contributed news

Finances:

  • Most use is free.
  • Around 10% pay for services. This is pretty good compared with LinkedIn, which appears to have around 1% who pay.
  • Fee is around 6 euros ($9) per month if you’re a paying subscriber.
  • Ferris Research estimates revenues at $40 million annually. The company is profitable and growing at a healthy clip. It’s taken a total of 15 million euros ($23 million) external funding.
  • Revenue breakdown:
    • User subscriptions: about 50%
    • Recruiting tools sold to HR and consultants: about 25%
    • Advertising: about 25%

Observations:

  • The collaboration tools are conventional but attractive.
  • The rich collaborative environment means that in principle, Viadeo can offer extremely targeted advertising, for which advertisers should be prepared to pay handsomely.
  • Viadeo’s collaboration tools are an important point of differentiation from LinkedIn, which is directory-centric.
  • Viadeo’s localizations--e.g., it is currently offered in Dutch, English, French, Italian, Portuguese, and Spanish--are another important differentiator.

David Ferris

If you care about compliance, the “Drafts” folder in your mailbox needs to be archived.

Archiving vendors take several approaches to email archiving. Some access mailboxes via MAPI and pull items out into the archive. Others intercept SMTP traffic and journal a copy into the archive. Others copy the database transaction log files (“log shipping”) and rebuild the email database for archiving purposes.

Journaling only captures what has been sent. MAPI and log shipping capture the Drafts folder.

This story on CNN contains a nugget about why archiving Drafts is important. “A Yahoo! e-mail account was set up so the men and militants could communicate …. E-mails were never sent from the account, but people would leave messages in the draft folder and delete them after reading.”

Whenever a law or compliance regime exists, people will try to get around it. Archiving Drafts is necessary for compliance.

David Sengupta

Exchange 2010’s database strategy is very interesting. The new Database Availability Groups and the benefits they offer for data protection and quick recovery are striking.

Overall, Microsoft is optimistic that mailboxes will be able to grow to 10GB or more. However, it’s unclear how large mailboxes will perform in practice. Several concerns spring to mind:

  • By leveraging low-cost disk, it is feasible, according to Microsoft, to store email online for up to 10 years. If you assume a typical user stores 5 MB of new email daily, this translates to 1.8 GB of email per year. Allowing for the continued growth in average message size, this translates to perhaps 10GB mailbox sizes over three years. For 100 users, this means 1 TB of total storage for Exchange. For 1,000 users, 10 TB of email storage! Managing multiple terabytes of storage is no simple matter when you take into account disk failures, power consumption, disaster protection, daily maintenance and monitoring, and so on
  • Database maintenance and ESEUTIL need space. Offline defragmentation requires free disk space equal to about 110% of the database size. Thus the amount of storage required should be doubled for maintenance purposes. True, Microsoft advises that offline defragging is unnecessary and inadvisable; hopefully this turns out to be wise counsel
  • If we wish to use the new Database Availability Groups for data protection and failover, extra copies will be required. Two to three copies seems likely, which implies another 2-3X in email storage
  • Single instance storage is no longer supported, which means message stores can’t be shrunk by 20% or 30% or so

In a typical scenario, for every 1 TB of primary database storage, perhaps another 3-4 TB of disk space may be necessary for various support capabilities, such as Database Availability Groups. This can quickly translate into a large amount of storage.
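The arithmetic above can be sketched in a few lines of Python. This is a rough illustration, not a sizing tool: the function names are hypothetical, decimal units (1 GB = 1,000 MB) are assumed, and the 5 MB per day, 10 GB mailbox, and 3-4x overhead figures are simply the estimates used in this post.

```python
# Back-of-envelope Exchange storage estimate.
# All inputs are the post's rough assumptions, not measured values.

MB_PER_GB = 1000  # decimal units assumed
GB_PER_TB = 1000


def yearly_mail_gb(mb_per_day, days_per_year=365):
    """New email stored per user per year, in GB."""
    return mb_per_day * days_per_year / MB_PER_GB


def primary_storage_tb(users, mailbox_gb):
    """Primary database storage for all mailboxes, in TB."""
    return users * mailbox_gb / GB_PER_TB


def total_storage_tb(primary_tb, overhead_multiplier):
    """Primary storage plus 'another N x' for copies and maintenance."""
    return primary_tb * (1 + overhead_multiplier)


if __name__ == "__main__":
    print(f"Per user per year: {yearly_mail_gb(5):.1f} GB")
    for users in (100, 1000):
        primary = primary_storage_tb(users, mailbox_gb=10)
        low = total_storage_tb(primary, 3)
        high = total_storage_tb(primary, 4)
        print(f"{users} users: {primary:.0f} TB primary, "
              f"{low:.0f}-{high:.0f} TB with overhead")
```

Changing the overhead multiplier or the mailbox size shows how quickly the total moves; the per-user figure matters far less than the number of copies kept.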

All in all, I doubt Microsoft’s claims, which imply that storing terabytes of email data online is a simple proposition.

Bob Spurzem

Charmingly named YippieMove helps you migrate email from one message store to another. It’s mainly aimed at small businesses and consumers who use cloud-based services. It’s easy to use and costs $14.95 per mailbox.

It handles IMAP-to-IMAP email only, with no address book or calendar migration services. No, it doesn’t even migrate Notes apps.

David Ferris

Archiving vendor Mimosa has recently had a couple of small financings:

  • $3M of equity in June, 2009
  • $4M of debt in November, 2009

Mimosa has never made a profit, and the recession of course has made fund-raising harder. We understand an IPO was deferred due to poor results. We understand, too, that Mimosa sought a buyer last year, but the valuations weren’t acceptable to management.

No doubt management is working hard to align revenues with expenses.

We don’t know the details, but presumably the cost of the recent financings has been high. In situations such as this, companies with good technology can find they’re forced into a fire sale buyout. Sometimes a small short-term debt financing is an immediate precursor to a fire sale.

If an acquisition takes place, it is to be hoped that the acquirer will continue to invest in Mimosa’s technology. Sadly, this doesn’t always happen. Many acquisitions end in tears, especially where the purchase was inexpensive.

David Ferris

NOTE OF DECEMBER 9: THIS IS BEING REVISED PER READER FEEDBACK. WE UNDERESTIMATED THE NEED FOR THE USE OF EXPENSIVE STORAGE. FURTHER INPUT WELCOMED

Concern is sometimes expressed about the cost of Exchange 2010 storage:

  • Users will have large mailboxes. 5GB to 20GB will be common. Take 10GB as a typical figure, allowing for three years of mailbox growth
  • Then factor in some multiplier, perhaps 3 or 4, to account for the optional features of Database Availability Group. The multiplier could be much larger, but most organizations won’t go for maximal bloat
  • So the average storage will commonly be around 30GB per mailbox

In short, user mailboxes are set to get substantially larger.

This will translate to many administrative challenges, but cost is unlikely to be one. At today’s prices, 30GB costs from $2 to $30, depending on the storage type. Amortized over three years, that makes $0.67 to $10 per mailbox per year. $10 is an absolute maximum; most storage will cost much less than this. Some of this storage will replace local storage used to keep PSTs. Looked at on a per-user basis, the cost of storage for Exchange 2010 is trivial.
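The per-mailbox calculation can be written out as a short Python sketch. The function names are hypothetical, and the inputs (a 10GB mailbox, a multiplier of 3, a $2 to $30 price range for 30GB, and three-year amortization) are this post’s working assumptions, which are themselves under revision per the note above.

```python
# Amortized storage cost per mailbox, using the post's rough figures.


def storage_per_mailbox_gb(mailbox_gb=10, dag_multiplier=3):
    """Total storage per mailbox once extra database copies are counted."""
    return mailbox_gb * dag_multiplier


def cost_per_mailbox_per_year(storage_cost_usd, years=3):
    """Spread a one-time storage purchase evenly over its service life."""
    return storage_cost_usd / years


if __name__ == "__main__":
    print(f"Storage per mailbox: {storage_per_mailbox_gb()} GB")
    for label, cost in (("cheapest", 2.0), ("most expensive", 30.0)):
        yearly = cost_per_mailbox_per_year(cost)
        print(f"{label}: ${yearly:.2f} per mailbox per year")
```

The sketch ignores everything except the purchase price of the disk, so it says nothing about the administrative costs mentioned above.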

David Ferris

Whistleblower website Wikileaks recently leaked more than half a million text messages around 9/11. We hesitate to even link to the story, but if you must read it, you can find CNN’s coverage here.

There is something sacred about the last messages exchanged with a spouse before a loved one dies. Messages of love. Hurt. Panic. Fear. Impending death. Yet we struggle with a lust for information, and the Internet makes it easy for us to disrespect personal privacy and ethics.

E-discovery brings with it an element of responsibility. Whether investigators are searching email, text messages, or other electronic data, there is an implied code of conduct around how the evidence is to be handled.

Posting these messages online represents a breach of privacy and a rupture in the ethical standards that should exist in any modern nation. In the words of one of the commentators, corkpuller, “deep wounds that have healed need not be reopened for the morbid curiosity of those who want to sell more soap and toilet paper.”

David Sengupta

A highly publicized hack into the email server at a prominent climate-research center highlights one of the problems with email evidence. It is far too easy to take things out of context.

In this case, over a thousand emails were taken and posted publicly. This has resulted in numerous accusations in the blogosphere of collusion and of warped global warming statistics.

Scientists are accused of taking climate change data, and then wrongly adding their assumptions around “corrective” factors to normalize the data. The scientists claim they did so in an attempt to more accurately reflect reality. They claim opponents have not read their papers explaining why they adjusted figures the way they did. Opponents claim the scientists are hiding the fact that global warming isn’t as bad as the scientists claim.

Opinion. Perception. Lessons learned in the school playground. He said. She said.

One of the fundamentals of human existence is that each of us perceives the world through the lens of our particular world view. We live trapped within our context, and cannot be as objective as any of us would wish to be. An ancient Chinese proverb says, “If you want to know what water is, don’t ask the fish.”

When it comes to interpreting the evidence – especially with something as ad hoc as email – it is absolutely critical to make every effort to understand the context within which a statement was made.

Claiming you have found a smoking gun, without having a clear understanding of the context, can lead to fatally flawed arguments. If you are building a case on such evidence, you are on shaky ground.

David Sengupta

There are different approaches to sharing documents, where a group of people want to be able to make comments on a document and alter it.

One approach is Track-Changes, which we’re all familiar with in MS Word. This is very useful. But then, after three or four sets of changes, it gets too confusing. The Track-Changes approach is best between two people; it does a reasonable job of letting you see what changes have been made.

Another approach is one we’ve just been looking at, that of SharedDoc, from SharedBook. Here, someone owns/controls a document. Other people can post responses at arbitrary points to comment on the content. Subresponses of arbitrary depth are allowed. Here, comments aren’t applied, unless the owner specifically decides to allow them. Reviewers must be specifically selected by the owner. This is another approach valuable in a variety of contexts where one seeks reviewer input, such as proposals, legal services, business documents, and publishing. SharedDoc is in beta, by the way, and is currently free.

Some other approaches to editing shared documents include:

  • Wikis, where people can go in and make changes directly, with an audit trail of the changes. You find this approach in Google Wave.
  • Content management-style versioning. You find this in SharePoint, for example.

David Ferris, with thanks to SharedBook’s Caroline Vanderlip for her interesting insights

Conferencing technology--such as Cisco WebEx and Microsoft Live Meeting--has two approaches to the audio element. You can route it over the data network, or you can route it over the conventional telephone system (“PSTN” or “Public Switched Telephone Network”).

In principle, it’s much better to treat the audio as just another data type for the conference and route it over a data network. This makes things much easier for users, because you can have tight integration with the conferencing application. No remembering and entering phone numbers, for example. And usually cheaper.

Internally, companies often run the audio part of conferencing over their in-house data network, because their internal network is fast enough and delivers data in a timely way. However, where participants must connect over public data networks, notably the public Internet, it’s common to fall back on conventional telephony, because voice transmission over public data network connections often is not quite good enough for the audio part of a conferencing session (even though most of us have experienced pretty good voice quality at times via Skype).

There are two reasons for this audio-over-Internet shortfall:

  • Voice over IP technologies like Skype convey voices much less well when several people are chiming in and even talking over one another--as quite often happens in a conferencing session--than they do for a one-to-one conversation
  • Conferencing behavior is very sensitive to slight delays in delivering the voice signal, known as “latency”. In the public Internet, there’s lots of latency, long and variable, as individual packets of data are delivered via widely varying routes through the network. For a discussion via electronic conferencing, that means all those subtle unwritten rules--about when and how it’s OK, or not OK, to chip in and interrupt--get really disrupted. Users find that very unsettling

When will the public Internet be good enough so that it’s the default for the audio element of conferencing? Our current guess is around 2014 in most rich countries. In other places, probably quite a bit longer than that. Could be 2040 in some places.

However, that only applies to users who depend on the public Internet, with its best efforts/no promises level of Quality of Service (QoS). More and more businesses are buying Wide Area Ethernet services from telephone companies. Those are IP networks, like the public Internet, but unlike the public Internet they use quality-enhancing techniques like MPLS that reduce latency, lost packets and other QoS problems. Users of such services will likely be able to realize the full benefits of routing both the voice and data parts of a conferencing session together over a data network, long before it becomes the usual method for users who rely entirely on the public Internet.

David Ferris, with thanks to Michael Tyler for his helpful input

Today, PC-based conferencing vendors such as Cisco, IBM/Lotus, and Microsoft tend to focus on PC-centric innovation. Examples include support for multi-person video cameras, integration with PBXs, and integration with desktop applications.

However, I think the biggest innovations, starting perhaps in 2012, will be found among mobile phones. These will have sufficient bandwidth and computing power to compete with desktops. However, they differ from desktops in important ways:

  • Their user interfaces are severely constrained
  • They are always with the user
  • They have, or will have, location awareness
  • They will have plenty of other very innovative aspects

Thus the exciting conferencing developments will, several years hence, turn around mobile phones.

The phones will remain small so that they can easily fit in a pocket. Obviously they’ll continue to get thinner. Innovation--in ways we don’t imagine--will help us get by with small screens. Presumably, videocameras will often be built in.

David Ferris

Desktop sharing is a useful component of conferencing technology. Its main uses are:

  • Document sharing, so people can work on spreadsheets, word processing documents, and so on, at the same time
  • Informal peer-to-peer help desks and training. Where you show someone how to do something

David Ferris

PS. I’m not sure I’m saying anything useful here! But I hope someone finds this useful.