In the evolving world of business intelligence, swift and targeted access to reports and analysis is the name of the game. But the frequent inability of employees to locate the results they need from high-end BI applications is prompting several enterprise search vendors to step in and address the challenge.
Because BI relies on data generated by accounting, sales, CRM systems and other back-end applications, it involves very large volumes of data. IT departments that have made substantial investments in BI packages from Cognos, Information Builders, Oracle and SAP, among others, are looking at ways to better expose that data and make it all actionable at a much faster clip. Meanwhile, the number of workers who need daily access to BI data to be more effective at their jobs rises steadily.
“A lot of things are changing in the industry to help expose more BI information,” says Frank Brooks, chief data architect at Blue Cross Blue Shield of Tennessee. “We had reached the point where we had so much BI information that it was difficult to go and find just one piece of it. So we had to counter that.”
Brooks and his team deployed IBM WebSphere Content Discovery for Business Intelligence, which in tandem with other integrated applications allows more workers to access critical BI data required for negotiating rates with various care providers and for processing claims. Rather than, say, waiting for biweekly reports and sifting through them, employees can now access a portal to search an array of applications where BI information is stored.
Brooks is one of many IT managers taking advantage of the increasing crossover between enterprise search and BI. Following news in April of Google OneBox, which extended the reach of the Google Search Appliance to BI, IBM and Microsoft announced new products and features for customers who want to marry search functionality with BI to get real-time business analytics into the hands of more employees. In May, Fast Search and Transfer joined its Enterprise Search Platform with the Cognos 8 Business Intelligence solution to deliver corporate content directly to workers who are not necessarily sophisticated BI consumers.
According to Vinod Baya, director at PricewaterhouseCoopers’ Technology Centre in San Jose, corporate users today are having difficulty getting to BI data due to three principal problems: “They aren’t aware that a BI report exists for the analysis they need; or if they know it exists, they can’t find it; or they can find it, but it doesn’t contain all the information they need.” Enterprise search, he says, can help with all three pain points.
Getting to the data
On many BI systems, the reports are designed by analysts versed in the software package’s report writer. These reports are catalogued as templates and generally run on a recurring basis, such as month-end. Then, the resulting documents are distributed to specific mailing lists of users.
The problem of finding the data in such a situation is two-fold for any user who isn’t on the regular mailing list. Firstly, how do you know if the report even exists? And secondly, if the report is known to exist, how do you access it? The latter problem is especially common because reports often sit on file servers where they are assigned cryptic names by the BI software.
Without an enterprise search engine that can locate the report or its underlying data, a user has few ways of getting the information. This situation leads to undesirable results: the employee forgoes the search or expends significant effort culling the data from other sources and re-creating the report. The latter results in duplication of effort and the risk that two reports that purport to present the same data have differing figures. Even when users can find reports, the documents will often lack the desired data. And because the reports are template-driven, users cannot easily modify them to provide different data.
Regulatory compliance is also driving the need for BI search. Compliance officers need to be able to search through CRM databases and email stores, for example, looking for dangerous phrases such as “We guarantee” or “I shouldn’t be telling you ...”
How BI search is different
One way that search makes it easier to extend access to BI is that users already know how to employ it due to their familiarity with web-based search engines. With little training, users can be shown how to use additional options, similar to those in the Advanced Search features on the web engines.
What happens behind the scenes in an enterprise search, however, varies significantly from the operation of web search engines. Most web queries today target unstructured data, such as HTML, PowerPoint presentations and PDF files. Because these resources have a document orientation, the engine can make intelligent decisions about the meaning and the relevance of the data. Web pages even have specific tags to facilitate this process.
Structured data, by contrast, does not generally provide this contextual information. Open a database and read a column of figures called “part” and you have very little knowledge of what that number refers to (part number, cost, inventory, location, among other things). As Baya points out, “this problem will eventually be solved by use of metadata, and this is already happening via the support for XML in databases. But with regard to the vast majority of structured data today, there is no easy solution.”
BI software solves this problem in part by the use of templates and the definition of data relationships by trained analysts. Because of this, many enterprise search engines today (such as Google and X1) hand off searches of structured data to the BI software and then federate (that is, combine) the results with items from their own search index.
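The hand-off-and-federate pattern can be illustrated with a minimal sketch. Everything here is hypothetical: the `Hit` type, the `search_index` and `search_bi` functions, the sample data and the naive keyword scoring are stand-ins for illustration, not the behaviour of any vendor's product.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    source: str   # "index" or "bi"
    score: float  # fraction of query terms matched (toy scoring)


def search_index(query, index):
    """Unstructured side: keyword match over the engine's own index."""
    terms = query.lower().split()
    hits = []
    for title, text in index.items():
        words = text.lower().split()
        matched = sum(1 for t in terms if t in words)
        if matched:
            hits.append(Hit(title, "index", matched / len(terms)))
    return hits


def search_bi(query, reports):
    """Stand-in for the BI package: match analyst-defined report keywords."""
    terms = query.lower().split()
    hits = []
    for title, keywords in reports.items():
        matched = sum(1 for t in terms if t in keywords)
        if matched:
            hits.append(Hit(title, "bi", matched / len(terms)))
    return hits


def federated_search(query, index, reports):
    """Combine both result lists and order them by score."""
    hits = search_index(query, index) + search_bi(query, reports)
    return sorted(hits, key=lambda h: h.score, reverse=True)


# Hypothetical sample data
INDEX = {
    "Q3 sales memo": "regional sales fell in the third quarter",
    "Holiday schedule": "office closed on public holidays",
}
REPORTS = {
    "Monthly Sales by Region": {"sales", "region", "revenue"},
    "Claims Backlog": {"claims", "backlog"},
}

results = federated_search("regional sales", INDEX, REPORTS)
for hit in results:
    print(hit.source, hit.title, hit.score)
```

In a real deployment the two sides would score results on different scales, so some normalisation step would be needed before merging; the sketch sidesteps this by using the same toy score for both.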
Unstructured data has its own challenges. The first is pure volume. As Mark Andrews, programme director of IBM’s Information Management Strategy, points out, a typical business user will deal with 70 emails per work day (including receiving and sending). In a company of 25,000 employees, that’s nearly half a billion emails per year that must be stored (for compliance purposes) and made searchable. Add to this all the other documents (HTML, word processing, spreadsheets and presentations) and you have a tremendous capacity issue that translates into another challenge. With many searches turning up thousands of results, how do you rank the results for relevance?
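The email estimate can be checked with simple arithmetic. The figure of 250 working days per year is an assumption made here for illustration; it is not stated by Andrews.

```python
EMAILS_PER_USER_PER_DAY = 70   # from the article
EMPLOYEES = 25_000             # from the article
WORKING_DAYS_PER_YEAR = 250    # assumption, not from the article

emails_per_year = EMAILS_PER_USER_PER_DAY * EMPLOYEES * WORKING_DAYS_PER_YEAR
print(f"{emails_per_year:,} emails per year")  # 437,500,000 emails per year
```

Under that assumption the total comes to roughly 437 million, consistent with "nearly half a billion".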
As Matthew Glotzbach, head of products at Google Enterprise, observes, “Unlike web searches, you don’t typically have spamming sites that are trying to fool your algorithms, but you also don’t have a large set of usage data to guide you.” Google doesn’t reveal its algorithms, but it does try to establish the “authoritativeness” of specific entries.
IBM, which is more forthcoming about its algorithms, uses a blend of weighting factors for relevance in its enterprise search. These include: user click patterns, the format and position of an entry in a document (headings have higher relevance than in-text entries), metadata (so that text in a link will be ranked differently than similar text in the body of a document) and so on.
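A blend of weighting factors of this kind might look like the following sketch. The field names, the weights and the way clicks are combined are all invented for illustration; this is not IBM's actual formula.

```python
# Hypothetical weights: where a term appears changes how much it counts.
FIELD_WEIGHTS = {"heading": 3.0, "link_text": 2.0, "body": 1.0}
CLICK_WEIGHT = 0.5  # hypothetical bonus per recorded user click


def relevance(term, doc):
    """Score one document for one search term using weighted factors."""
    term = term.lower()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        occurrences = doc.get(field, "").lower().split().count(term)
        score += weight * occurrences
    # Click history only boosts documents that match the term at all.
    if score > 0:
        score += CLICK_WEIGHT * doc.get("clicks", 0)
    return score


doc_a = {"heading": "Claims report", "body": "monthly totals", "clicks": 0}
doc_b = {"heading": "Totals", "body": "claims processed and claims pending", "clicks": 0}
doc_c = {"heading": "Archive", "body": "claims", "clicks": 10}

for name, doc in [("a", doc_a), ("b", doc_b), ("c", doc_c)]:
    print(name, relevance("claims", doc))
```

Here a single heading match (document a, score 3.0) outranks two in-text matches (document b, score 2.0), while heavy click traffic lifts document c to 6.0.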
Most products today provide a way of increasing relevance of certain documents or URLs so that they occupy first place in a given search. (For example, a query on “sexual harassment” can be tweaked so that the company’s policy is always the first item returned.) In addition, many products enable customisation for company-specific lingo. This permits search engines to know, for example, that a query regarding “Region 1” refers to the Eastern US seaboard.
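Both kinds of tuning can be sketched in a few lines. The `PINNED` and `LINGO` tables, the function names and the document titles are hypothetical, and the substring matching is deliberately naive.

```python
# Hypothetical administrator-maintained tables
PINNED = {"sexual harassment": "Company Sexual Harassment Policy"}
LINGO = {"region 1": "eastern us seaboard"}


def expand_query(query):
    """Append the plain-language meaning of company-specific terms."""
    q = query.lower()
    for phrase, meaning in LINGO.items():
        if phrase in q:  # naive substring match, fine for a sketch
            q = q.replace(phrase, f"{phrase} {meaning}")
    return q


def apply_pins(query, ranked_titles):
    """Force the pinned document, if any, to the top of the results."""
    pinned = PINNED.get(query.lower().strip())
    if pinned is None:
        return list(ranked_titles)
    return [pinned] + [t for t in ranked_titles if t != pinned]


print(expand_query("Region 1 sales"))
print(apply_pins("Sexual Harassment",
                 ["HR newsletter", "Company Sexual Harassment Policy"]))
```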
Proceed with security in mind
Access control is a central aspect of BI search. This problem occurs in two directions: how does an employee access all the needed data for a report and how is an employee blocked from seeing confidential data? In a perfect world, single sign-on would address the first issue, and access to an LDAP directory server would resolve the second. The problem is in the implementation: much of the data is located on systems whose access control is not neatly defined by a corporate-wide access mechanism.
The problem is actually worse than it appears. Says Maxime Tiran, an engineer in IBM’s Data Management division, “When company IT departments set up enterprise-wide searching tools, they are frequently horrified by the kinds of confidential data that are widely accessible and completely unprotected on their intranets.”
Security schemes vary, and sites contemplating adding search to their BI need to determine how access control is handled by the products they’re considering. Many products simply pass the user credentials to the BI package or other back-end software and rely on those applications to limit the returned results according to their built-in access mechanisms. This aspect is a particular strength of Oracle’s Secure Enterprise Search product.
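The credential pass-through approach described above can be sketched as follows. The `BackEnd` class, the group names and the documents are hypothetical; this illustrates the general idea only and is not Oracle's or any other vendor's API.

```python
class BackEnd:
    """Stand-in for a BI package that enforces its own access rules."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # title -> set of groups allowed to see it

    def search(self, term, user_groups):
        # The back end, not the search engine, decides what the user may see.
        return [
            title
            for title, allowed in self.documents.items()
            if term.lower() in title.lower() and allowed & user_groups
        ]


def secure_search(term, user_groups, back_ends):
    """Pass the user's credentials through and merge the trimmed results."""
    results = []
    for back_end in back_ends:
        results.extend(back_end.search(term, user_groups))
    return results


bi = BackEnd("bi", {
    "Provider rate summary": {"contracts", "finance"},
    "Executive rate forecast": {"executives"},
})
crm = BackEnd("crm", {"Rate complaints log": {"contracts"}})

print(secure_search("rate", {"contracts"}, [bi, crm]))
```

A user in the hypothetical "contracts" group sees the provider summary and the complaints log but never learns that the executive forecast exists.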