Tuesday, August 6, 2013

MAD Practices: Radical Departure from Traditional EDW & BI

"If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap.
So, what's getting ubiquitous and cheap? Data.
And what is complementary to data? Analysis."
                                                       - Prof. Hal Varian, UC Berkeley, Chief Economist at Google


Thursday, June 13, 2013

The Rise of Dashboards in the Banking and Telecommunications Industries


The term “dashboard” is as overused today as it is misunderstood. What exactly are dashboards? Are they just screens with fancy charts and graphics on them, or do they serve a greater purpose? The easiest way to understand what role dashboards play in the industry is to examine some of the questions that professionals are asking.


  • How do we proactively prevent customer attrition? 
  • What are the top and most current reasons that customers are leaving, and how might we respond to these dynamics? 
  • Can we proactively spot trends and possible issues in our billing system that could have an impact on churn?

Tuesday, February 19, 2013

A Discussion on Green Computing

"Each PC in use generates about a ton of carbon dioxide every year"

We are passionate about advances in and widespread adoption of IT. However, IT has been contributing to environmental problems, which most people don’t realize. To reduce IT’s environmental footprint and to create a sustainable environment, we call upon the IT sector, as well as every computer user, to green their IT systems and the way they use them. We are legally, ethically, and socially obligated to green our IT products, applications, services, and practices.

Green IT is a hot topic today and will continue to be an important issue for several years to come.
Green: A Computer's Entire Life-cycle

Monday, February 18, 2013

Density Based Clustering Algorithm: DBSCAN with Implementation in MATLAB

A density-based clustering algorithm locates regions of high density that are separated from one another by regions of low density. DBSCAN takes a center-based approach to clustering, in which the density at a particular point of the data set is estimated by counting the number of points within a specified radius, ɛ, of that point.
The center-based approach to density allows us to classify each point as one of three types:

  • Core points: These points are in the interior of a dense region.
  • Border points: These points are not core points, but fall within the neighborhood of a core point.
  • Noise points: A noise point is any point that is neither a core point nor a border point.

The formal definition of the DBSCAN algorithm is illustrated below:
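Although the post targets MATLAB, the core/border/noise logic described above can be sketched compactly in pure Python. This is a minimal illustration, not the post's implementation; the names dbscan, region_query, eps, and min_pts are my own.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (incl. itself)."""
    return [j for j, q in enumerate(points)
            if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:      # not a core point; noise for now
            labels[i] = -1
            continue
        cluster += 1                      # start a new cluster at this core point
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:           # noise reached from a core point
                labels[j] = cluster       # ...becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:   # j is also a core point: expand
                seeds.extend(j_neighbors)
    return labels
```

For example, two tight clumps of three points each plus one far-away point yield two clusters and one noise label (-1), matching the three point types above.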

Saturday, February 16, 2013

Algorithm Analysis: The Basic K-means Clustering Algorithm

The basic k-means clustering algorithm is a simple algorithm that separates the given data space into different clusters based on centroid calculation using some proximity function. We first choose k points as initial centroids; each point is then assigned to the cluster with the closest centroid, each cluster's centroid is recomputed from the points assigned to it, and the two steps repeat until the assignments stop changing. The algorithm is formally described as follows:
Input: A data set D containing m objects (points) with n attributes in a Euclidean space
Output: A partitioning of the m objects into k clusters C1, C2, C3, …, Ck, such that Ci ⊆ D and Ci ∩ Cj = ∅ (for 1 ≤ i, j ≤ k, i ≠ j)
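The assign-then-recompute loop described above can be sketched in pure Python as follows (an illustrative toy, assuming Euclidean distance as the proximity function and random data points as initial centroids):

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    recompute each centroid as its cluster's mean, until assignments stabilize.
    Returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # initial centroids: k data points
    assign = None
    for _ in range(iters):
        # Assignment step: each point goes to the closest centroid.
        new_assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_assign == assign:               # no change: converged
            break
        assign = new_assign
        # Update step: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:                        # guard against empty clusters
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return assign, centroids
```

On two well-separated clumps with k = 2, the returned assignments partition the points into the two clumps, i.e. the clusters are disjoint and cover all m points as the formal description requires.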

Sunday, February 10, 2013

The Hour-Glass Model of Grid Computing Architecture

The term “the Grid” was coined in the mid-1990s to denote a (then) proposed distributed computing infrastructure for advanced science and engineering. A key issue in a grid computing system is that resources from different organizations are brought together to allow the collaboration of a group of people or institutions. Such a collaboration is realized in the form of a virtual organization (VO).

[I] “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” – Foster & Kesselman, 1998

[II] “Grid computing is concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The key concept is the ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose.”                                                           - Foster & Tuecke, 2000

Grid Computing Architecture
The hour-glass model of grid computing architecture, as proposed by Dr. Ian Foster (2001), consists of a thin center, a wide top, and a wide bottom, arranged in layers: a small set of core protocols at the neck connects a wide variety of applications above to a wide variety of underlying resources below.

Figure 2: Layered Architecture of Grid Computing System

Thursday, February 7, 2013

Support Vector Machine Approach for Detecting Credit Card Frauds

Financial fraud is increasing significantly with the development of modern technology and the global superhighways of communication, resulting in the loss of billions of dollars worldwide each year. Companies and financial institutions lose huge amounts to fraud, and fraudsters continuously devise new tactics to commit illegal actions. Thus, fraud detection systems have become essential for all credit-card-issuing banks to minimize their losses. The most commonly used fraud detection methods are Neural Networks (NN), rule-induction techniques, fuzzy systems, decision trees, Support Vector Machines (SVM), Artificial Immune Systems (AIS), genetic algorithms, and K-Nearest Neighbor algorithms.

The detection of fraud is a complex computational task, and there is still no system that can predict with certainty that a given transaction is fraudulent. Such systems only estimate the likelihood that a transaction is fraudulent.
The properties of a good fraud detection system are:
1) It should identify frauds accurately
2) It should detect frauds quickly
3) It should not classify a genuine transaction as fraud
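Of the methods listed above, a linear SVM is the simplest to sketch. Below is a toy, from-scratch version trained by batch sub-gradient descent on the L2-regularized hinge loss; the two features (imagine, hypothetically, a normalized transaction amount and a time-of-day deviation) and all names are my own, and a real fraud detector would use many engineered features and a production SVM library.

```python
def train_linear_svm(X, y, lam=0.001, lr=0.1, epochs=1000):
    """Batch sub-gradient descent on the L2-regularized hinge loss.
    X: list of feature vectors; y: labels in {-1, +1} (+1 = fraudulent).
    Returns the weights w and bias b of a linear decision function."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw = [lam * wj for wj in w]        # gradient of the regularizer
        gb = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:                 # margin violator: hinge sub-gradient
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict(w, b, x):
    """+1 flags the transaction as likely fraud, -1 as genuine.
    In practice the raw score would be used as a fraud likelihood."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

Note that, consistent with the caveat above, the raw decision score (before taking the sign) is what a deployed system would report: a margin-based likelihood that the transaction is fraudulent, not a certainty.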