Monday, January 21, 2013

Hurdles #2 - Apache Pivot - Finding Named Objects

My second Hurdles article continues with Apache Pivot. The topic this time is finding named objects: if a container or control somewhere in your window has a name or bxml:id set, how do you get a reference to that object using the identifying attribute?

My specific example is a Dialog object that contains a TextInput. When the Dialog is closed I want to find that TextInput control, read the text value that was entered, and act on it. The setup is very simple: inside my dialog I have created a TablePane to structure my layout, and within the TablePane I have a Label, the TextInput (with an attached validator) and two buttons: a "Submit" button that on a buttonPressed event does "dialog.close(true)" and a "Cancel" button that does "dialog.close(false)". I have also configured a DialogCloseListener in code that will process the close event, check to see if the Dialog has a result, and perform an action with the TextInput value.


I was eventually able to find two solutions to this problem. Which one is preferable depends on the situation and the specific implementation, so I will present both here. There may be additional solutions that I am not aware of, but my goal was to get an object reference with minimal code and in a generic fashion.

Option 1: Named Component Traversal
Unfortunately, container tree traversal in Apache Pivot is not as natural, convenient, or consistent as I expected. It certainly is not as powerful as an XML DOM parser or a Java File object. Unless you are using position-based object location, component traversal has a few requirements:
  1. Every component on the path from the starting container down to the target must have a name attribute set (although name is not a required attribute)
  2. The name attribute must be unique among the children of a common parent
  3. Each node must be traversed in sequence from parent to child to find the intended descendant; there does not appear to be any kind of path definition or recursive lookup available
  4. getNamedComponent returns a Component, which does not itself have a getNamedComponent method. The method is declared in the Container subclass of Component, so each traversal step requires at least one cast. I did not find a "getAllChildren" method, so I do not know whether there is any way to do a tree exploration or blind traversal (which would require reflection as well as a cast)
So given the following BXML structure:
<Dialog bxml:id="dialog" title="Dialog" modal="true"
    xmlns:bxml="http://pivot.apache.org/bxml"
    xmlns="org.apache.pivot.wtk">
    <TablePane name="table">
        <columns>
            <TablePane.Column width="1*"/>
        </columns>

        <TablePane.Row height="1*">
            <Label text="Enter number:"
                styles="{horizontalAlignment:'center', verticalAlignment:'center'}"/>
            <TextInput text="0" name="numberInput" bxml:id="numberInput">
                <validator>
                    <IntValidator xmlns="org.apache.pivot.wtk.validation"/>
                </validator>
            </TextInput>
        </TablePane.Row>

        <TablePane.Row height="-1">
            <PushButton buttonData="Submit"
                ButtonPressListener.buttonPressed="dialog.close(true)"/>
            <PushButton buttonData="Cancel"
                ButtonPressListener.buttonPressed="dialog.close(false)"/>
        </TablePane.Row>
    </TablePane>
</Dialog>
The code required to locate the "numberInput" TextInput may look something like the following:
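dialog.open(window,
   new DialogCloseListener() {
       @Override
       public void dialogClosed(Dialog arg0, boolean arg1) {
           // Only read the input when the dialog was closed with a true result (Submit)
           if(arg0.getResult()) {
               // Each step down the tree needs a cast back to a Container subclass
               TablePane tp = (TablePane)arg0.getNamedComponent("table");
               TextInput ti = (TextInput)tp.getNamedComponent("numberInput");
               System.out.println(ti.getText());
           }
       }
   });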
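To make the difference between step-by-step traversal and a recursive lookup concrete, here is a minimal sketch in plain Java. It does not use Pivot at all: the Node class, findByPath and findByName are hypothetical names invented for this illustration, and it assumes the toolkit gives you some way to enumerate a container's children.

```java
import java.util.ArrayList;
import java.util.List;

public class NamedLookupSketch {
    /** Hypothetical stand-in for a UI component; not a Pivot class. */
    static class Node {
        final String name;                       // may be null, like an unnamed component
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        /** Adds a child and returns this node so calls can be chained. */
        Node add(Node child) { children.add(child); return this; }
    }

    /** Step-by-step lookup: each name must match a direct child of the previous node. */
    static Node findByPath(Node root, String... names) {
        Node current = root;
        for (String name : names) {
            Node next = null;
            for (Node child : current.children) {
                if (name.equals(child.name)) { next = child; break; }
            }
            if (next == null) return null;       // the path is broken at this step
            current = next;
        }
        return current;
    }

    /** Depth-first search: returns the first descendant with the given name, or null. */
    static Node findByName(Node root, String name) {
        for (Node child : root.children) {
            if (name.equals(child.name)) return child;
            Node found = findByName(child, name);
            if (found != null) return found;
        }
        return null;
    }

    public static void main(String[] args) {
        Node dialog = new Node("dialog").add(
            new Node("table")
                .add(new Node(null))             // the unnamed Label
                .add(new Node("numberInput")));
        System.out.println(findByPath(dialog, "table", "numberInput").name);
        System.out.println(findByName(dialog, "numberInput").name);
    }
}
```

The path version mirrors what the listener above does by hand, one named step at a time. The recursive version only needs the target's name, but it returns the first match it finds, so it relies on names being unique across the whole tree rather than just among siblings.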
Option 2: BXMLSerializer Lookup
The BXMLSerializer approach is the polar opposite of the traversal approach. It also has a uniqueness constraint, but here the framework enforces it: violating the constraint results in a SerializationException being thrown.

The BXMLSerializer approach requires that your target component has a bxml:id attribute set. Every component with a bxml:id attribute is deposited into the namespace map for the definition file that the serializer processed. The catch is that you must keep a reference to the BXMLSerializer instance that was used to parse the BXML file, and that reference must be accessible to whichever handler or listener needs to use it.
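The idea of a flat id map with an enforced uniqueness rule can be sketched in a few lines of plain Java. This is not Pivot's implementation: IdNamespaceSketch and its methods are hypothetical, and an IllegalStateException stands in for the SerializationException that Pivot throws.

```java
import java.util.HashMap;
import java.util.Map;

public class IdNamespaceSketch {
    // Hypothetical flat id -> object map, filled while a document is parsed.
    private final Map<String, Object> namespace = new HashMap<>();

    /** Registers an object under its id; a duplicate id is rejected outright. */
    void register(String id, Object value) {
        if (namespace.containsKey(id)) {
            throw new IllegalStateException("Duplicate id: " + id);
        }
        namespace.put(id, value);
    }

    /** Single-step lookup, regardless of how deeply the object was nested. */
    Object get(String id) {
        return namespace.get(id);
    }

    public static void main(String[] args) {
        IdNamespaceSketch ns = new IdNamespaceSketch();
        ns.register("dialog", "the dialog");
        ns.register("numberInput", "the text input");
        System.out.println(ns.get("numberInput"));
        try {
            ns.register("numberInput", "another input");   // same id a second time
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the map is flat, the lookup cost and the calling code stay the same no matter where the component sits in the layout, which is exactly what makes this approach attractive compared to walking the tree.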

Taking the example BXML file in Option 1 the following code could be used to access the TextInput control:
private BXMLSerializer bxmlSerializer;

public void startup(Display display, Map<String, String> properties)
        throws Exception {
    // Keep the serializer in a field so the listener below can reach its namespace
    bxmlSerializer = new BXMLSerializer();

    Dialog dialog = (Dialog)bxmlSerializer.readObject(Main.class, "bxml/dialog.bxml");
    dialog.open(window,
        new DialogCloseListener() {
            public void dialogClosed(Dialog arg0, boolean arg1) {
                if(arg0.getResult()) {
                    // Look the control up directly by its bxml:id
                    TextInput ti = (TextInput)bxmlSerializer.getNamespace().get("numberInput");
                    System.out.println(ti.getText());
                }
            }
        });
}

Note that in this sample there is no hierarchy connection between the Dialog itself and the "numberInput" control. However, Pivot provides a convenient way to go in the other direction: the Component class has both "getAncestor" and "getParent" methods that allow quick traversal up the tree once you have figured out how to get the child.
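Walking up the tree is simple because every node has exactly one parent. The sketch below shows the general shape of an ancestor lookup by type in plain Java. The Node classes and findAncestor are hypothetical and only illustrate the idea; they are not Pivot's own methods.

```java
public class AncestorSketch {
    /** Hypothetical component with a parent link; not a Pivot class. */
    static class Node {
        Node parent;

        /** Attaches the child to this node and returns the child. */
        Node addChild(Node child) { child.parent = this; return child; }
    }

    static class DialogNode extends Node {}
    static class TableNode extends Node {}
    static class InputNode extends Node {}

    /** Walks up the parent links until a node of the requested type is found. */
    static <T extends Node> T findAncestor(Node start, Class<T> type) {
        for (Node n = start.parent; n != null; n = n.parent) {
            if (type.isInstance(n)) return type.cast(n);
        }
        return null;   // reached the root without a match
    }

    public static void main(String[] args) {
        DialogNode dialog = new DialogNode();
        Node table = dialog.addChild(new TableNode());
        Node input = table.addChild(new InputNode());
        // The input finds its dialog without knowing how many levels lie between them
        System.out.println(findAncestor(input, DialogNode.class) == dialog);
    }
}
```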

If you have an alternate method to access an arbitrary component within a window that is an improvement to any of the methods described here, please send me an email. My approaches described above were learned through trial and error because specific documentation on how to do this was lacking online and if there are any better approaches I will post them here as a follow-up.

Wednesday, January 16, 2013

Gotcha - Teradata Nested Views & Functions

Here's another interesting "gotcha" involving Teradata v13.1 and how it handles metadata for views, this time when using multiple layered views and a custom function. We encountered this error when creating a summary-style view in our data warehouse to push data aggregation into the database layer instead of building the query in Cognos Framework Manager.

The exact Teradata error that we received while attempting to create a new View was:
REPLACE VIEW Failed. 3822: Cannot resolve column 'DATE_ENROLLED'. Specify table or view.
A very generic error message, but what is unique about this particular column in the View is that it is the only one that is being operated on by a function.

The structure of the View query is (very abbreviated) as follows:
REPLACE VIEW view_db.new_view
(DAY_KEY)
AS
SELECT
    COMMON_DB.CONVERT_TIMESTAMP_TO_KEY(DATE_ENROLLED) AS DAY_KEY
FROM
view_db.person
The query by itself runs without any problems and returns the correct results. But as soon as I put it into a CREATE or REPLACE VIEW statement it failed. Please note that the source of the query (view_db.person) is a View as well that merges records together from a source table (data_db.person) to produce an accurate list of current people.

During our investigation we discovered that when we replaced the source of the query (view_db.person) with the source table of that view (data_db.person), the error no longer appeared. The column name was unaffected because DATE_ENROLLED is an unmodified field in both the View and the Table. However, this was not a solution because it defeated the purpose of building the view_db.person View in the first place, and shoe-horning that view's query into a sub-select would be very complex and nearly unmaintainable.

So our new view and the source of its column were structured as follows:
view_db.new_view [DAY_KEY]
  - COMMON_DB.CONVERT_TIMESTAMP_TO_KEY (function)
      - view_db.person [DATE_ENROLLED]
          - data_db.person [DATE_ENROLLED]
The Solution

We suspect that the Teradata database has problems resolving the source of the DATE_ENROLLED column because the column is named identically in view_db.person and data_db.person, and that this is compounded by applying the function call.

We were able to resolve the issue by creating a special sub-select query on the view_db.person that has the sole purpose of renaming the DATE_ENROLLED column.

The resulting fixed REPLACE VIEW statement is as follows:
REPLACE VIEW view_db.new_view
(DAY_KEY)
AS
SELECT
  COMMON_DB.CONVERT_TIMESTAMP_TO_KEY(DATE_ENROLLED_VIEW) AS DAY_KEY
FROM
  (SELECT DATE_ENROLLED AS DATE_ENROLLED_VIEW FROM view_db.person) a
This allowed the DATE_ENROLLED column to be correctly resolved and the view to be created successfully.

Tuesday, January 15, 2013

Hurdles #1 - Apache Pivot - BXML Text Validators

Learning new technologies, frameworks, and languages can be hard. This is especially true when working alone from online documentation when the subject matter is either new, not widely adopted, or the documentation is a work in progress. Minor hurdles can derail the best intentioned learner because the solution is so blindingly obvious to anyone with knowledge of the subject that a solution is never stated.

I have started this post series titled "Hurdles" to track the minor, obvious, but frustrating issues I encounter when learning new things. Perhaps someone, sometime will find one of these posts useful, but if not it will at least be a log of lessons learned.

This article is about Apache Pivot, an open-source Java UI library specifically targeted towards creating rich interface applications for the web or standalone, along with a number of supporting libraries that simplify things like creating REST-ful services. After reading up on Pivot it intrigued me enough to give it a try to evaluate it and perhaps find a use for it on personal projects or at work.

My first impression was that the BXML definition and binding structures have strong flavours of WPF. There are significant differences of course, but the feel of familiarity, and the relative ease of following the tutorials to put together a simple application with a rich interface layer and simple web services, left me with a positive view of the framework.

The first real hurdle came when I was designing my first input form and wanted to attach a validator to a TextInput control using BXML. The APIs make attaching a validator a trivial exercise in Java code: the TextInput object has a method called "setValidator" that takes an "org.apache.pivot.wtk.validation.Validator". The available validators themselves are simple but varied, and selecting an IntValidator for this task was easy.

However, adding a new IntValidator instance into a TextInput tag in BXML was not as simple as it seemed. I tried a variety of [validator="obj"] attributes, using dereference, parameter, and variable syntax. All I got for my trouble was a heap of error messages and invalid cast exceptions.

I eventually found my solution in the Apache Pivot - Users nabble forum (for reference, here is the link to the Apache Pivot - Developers forum too) in a topic titled "Hi,   ". The solution was to create a child tag for the property that I wanted to set within the TextInput definition, and then create a child tag of that tag containing the validator instance definition. This is the standard process for setting Collection properties, but it applies to single Object properties as well. Below is the simplest BXML source to illustrate the example.

<TextInput>
    <validator>
        <IntValidator xmlns="org.apache.pivot.wtk.validation"/>
    </validator>
</TextInput>

Friday, November 23, 2012

Gotcha - Teradata Views

I encountered another interesting "gotcha" involving Teradata v13.1 and how it handles metadata for views. We ran into this issue within Cognos Framework Manager v10.1.1 when attempting to use a view created in Teradata as a query subject.

The exact Cognos error that we received was:
RQP-DEF-0177 An error occurred while performing operation 'sqlScrollBulkFetch' status='-9'.
UDQ-SQL-0107 A general exception has occurred during the operation "fetch".
[Teradata][ODBC Teradata Driver][Teradata Database] Internal error: Please do not resubmit the last request. SubCode, CrashCode:
After running a UDA trace and a Teradata ODBC driver trace and reviewing the log files we discovered a statement that was causing the error message:
HELP COLUMN "DB_NAME"."VIEW_NAME"."PK_ID_FIELD_NAME"
Running this query manually on the database gave a more detailed, but still obscure error message:
HELP Failed. 3610: Internal error: Please do not resubmit the last request. SubCode, CrashCode:
The view itself that we were debugging was extremely complex, but after some experimentation I was able to produce the following simple view definition that still caused the error.

CREATE VIEW DB_NAME.VIEW_NAME AS
SELECT
T1.FIELD1,T2.PK_ID_FIELD_NAME
FROM
DB_NAME.PARENT_TABLE T1,
DB_NAME.CHILD_TABLE T2
WHERE T1.FK_ID_FIELD_NAME = T2.PK_ID_FIELD_NAME
;

Simple, right? Gotcha #2 is that this error only appeared in 2 of our 3 environments: Development and UAT showed the issue, but our SystemTest environment worked without a problem.

We were able to devise a temporary workaround. Because the HELP query specifically identified a problem with the PK_ID_FIELD_NAME on the CHILD_TABLE, we replaced it with the FK_ID_FIELD_NAME on the PARENT_TABLE, which made the error go away. However, this was not a real solution, because logically retrieving the primary key of a joined table in a view should NOT cause a problem.

The Solution

The exact reason why this problem was happening on 2 of our 3 systems is still unknown; we suspect that corrupt or missing column metadata was causing the inconsistency. Nevertheless, we did find a solution to the problem.

The problem was resolved by explicitly naming the view's columns in the view definition. For whatever reason, this bypassed the metadata error and allowed the view to be used in both Cognos and Teradata SQL Assistant. Below is the fixed view definition; the only change is the explicit column list (FIELD1, PK_ID_FIELD_NAME) after the view name:
CREATE VIEW DB_NAME.VIEW_NAME (FIELD1, PK_ID_FIELD_NAME) AS
SELECT
T1.FIELD1,T2.PK_ID_FIELD_NAME
FROM
DB_NAME.PARENT_TABLE T1,
DB_NAME.CHILD_TABLE T2
WHERE T1.FK_ID_FIELD_NAME = T2.PK_ID_FIELD_NAME
;
This allowed the HELP COLUMN metadata to be generated correctly for the view and fixed this issue without having to restructure the view query itself.

Monday, November 12, 2012

Report Conceptualization Training Strategy – Part 1, Report Visualization


For beginning and advanced report developers alike, the biggest challenge I have witnessed is Report Conceptualization. This is the process of translating often incomplete business requirements into a structured plan that will produce the result that the business needs. For operational reports based on known data structures and calculations this process can be simple: operational reports are often structured very simply, and the job of the report developer is just to put the right fields in the right place. The process becomes much more difficult for strategic reports, which attempt to help the business define what they should be doing. Strategic reports are often poorly understood by the business, their needs are uncertain and difficult to communicate, and the vision for the end product is amorphous. Turning vague, conceptual requirements into a concrete view of business data is a challenge for developers and analysts alike.

The first step is understanding which are the best visualization options to meet the business requirements. Often a table of numbers is what a business unit understands and asks for because they are used to dealing with operational reports where they need numbers. But sometimes a well-designed visualization of the data makes it easier to understand and gives the business the interpretation that they need in order to make a decision without having to spend time crunching numbers.

Visualization of a data set needs to be chosen carefully to properly communicate the information that needs to be understood. A poorly chosen visualization, especially if it is poorly documented, can provide confusing, useless, and sometimes misleading information.

A couple of my favourite visualizations that do an excellent job of communicating information are:

1)  Florence Nightingale’s Diagram of the causes of mortality in the army in the East

The purpose of Nightingale’s chart was to illustrate that the primary causes of death amongst soldiers in the Crimean War were preventable diseases. The polar area diagram does a fantastic job of showing this, and by how much. As an aside, Nightingale could have exaggerated her diagram by choosing a polar radius diagram, where the radial measurement rather than the area of the wedge is linear with the value. Since the eye naturally compares areas, that would have implied that the scale of deaths due to preventable causes was that much larger, but to her credit Nightingale strove for precise accuracy in her representations.
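To put a number on that exaggeration: the area of a wedge grows with the square of its radius, so a radius that is linear with the value squares the apparent difference. The small Java sketch below compares the two scalings using made-up values of 1 and 4 (not figures from Nightingale's data); the class and method names are my own.

```java
public class WedgeScaleSketch {
    /** Radius that makes the wedge AREA proportional to the value (polar area diagram). */
    static double radiusForArea(double value) {
        return Math.sqrt(value);
    }

    /** Radius that is itself proportional to the value (polar radius diagram). */
    static double radiusLinear(double value) {
        return value;
    }

    /** Ratio of two wedge areas with equal angles; area scales with radius squared. */
    static double areaRatio(double r1, double r2) {
        return (r1 * r1) / (r2 * r2);
    }

    public static void main(String[] args) {
        double small = 1.0, large = 4.0;   // made-up values for illustration
        // Area-true scaling: the larger wedge looks 4 times bigger
        System.out.println(areaRatio(radiusForArea(large), radiusForArea(small)));
        // Radius-linear scaling: the larger wedge looks 16 times bigger
        System.out.println(areaRatio(radiusLinear(large), radiusLinear(small)));
    }
}
```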

2)  Charles Minard’s Flow map of Napoleon’s March

Minard’s graph combines several pieces of information in a novel way: it represents the course of Napoleon’s march geographically, shows the number of soldiers on both the initial march and the retreat from Moscow along with the successive losses incurred, and also plots the temperature on the return march to show the impact of the weather on casualties.

That being said, visualization is something that needs to be developed with cooperation from the business people who are going to be using it. Even if a 100% Stacked Bar Chart is a perfect representation of the information that needs to be communicated, it is worthless if the business users do not understand what it represents and how to use it, or are trying to interpret more information from the visualization than exists because of their preconceptions about what a bar chart is.

One example is the use of two types of bar charts: the Stacked Bar Chart and the 100 Percent Stacked Bar Chart. The Stacked Bar Chart can often be confused with an Area Chart, as the stacked bars are misinterpreted as “overlapping” by assuming the view is a projection of the normal Standard Bar Chart from the end. Likewise the 100 Percent Stacked Bar Chart is confused with the Stacked Bar Chart by assuming that the height represents magnitude rather than share. This confusion arises because the business user does not see the 100 Percent Stacked Bar Chart as analogous to a series of Pie Charts.
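The magnitude versus share distinction is easy to see in code. The sketch below, with made-up numbers and a hypothetical toShares helper, converts each bar's segments into percentages of that bar's total, which is the transformation a 100 Percent Stacked Bar Chart applies before drawing.

```java
import java.util.Arrays;

public class StackedShareSketch {
    /** Converts one bar's segment values into percentage shares of the bar total. */
    static double[] toShares(double[] segments) {
        double total = 0;
        for (double s : segments) total += s;
        double[] shares = new double[segments.length];
        for (int i = 0; i < segments.length; i++) {
            // An empty bar has no meaningful shares, so report zeros
            shares[i] = total == 0 ? 0 : 100.0 * segments[i] / total;
        }
        return shares;
    }

    public static void main(String[] args) {
        double[] first = {10, 30};      // made-up bar with a total of 40
        double[] second = {100, 300};   // ten times larger, same mix
        System.out.println(Arrays.toString(toShares(first)));
        System.out.println(Arrays.toString(toShares(second)));
    }
}
```

Both bars come out as 25 and 75 percent even though the second is ten times larger, which is exactly the information a reader loses if they read bar height as magnitude.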

Remember when designing charts for business use that creative interpretation can hinder clarity. A grid of Pie charts may feel clumsy, but may do a better job of communicating effectively with your business user.

Deciding
How do you decide when to use what kind of chart?

The first step in choosing an appropriate visualization is to understand what you are trying to communicate. The purpose of a chart is to convey information about some kind of relationship between pieces of data. There are always at least two pieces of information in a chart (otherwise it is a very boring chart), and the type of relationship between those pieces of information helps determine the most effective way to display it.

What kind of relationship do you want to illustrate?

  • Do you want to draw a comparative relationship?
    Ex. How do the values of A compare to B, compare to C, over time?
  • Do you want to illustrate the existence of a relationship?
    Ex. As the values of A increase, what happens to B or C?
  • Do you want to understand a composition, how pieces make up a whole?
    Ex. How are the values of A broken down into groups B and C?
  • Do you want to show how values in a relationship are distributed?
    Ex. How often does A occur by value?
One of the better examples I have seen on how to start understanding this process is by Extreme Presentation Method and their Chart Chooser (http://www.extremepresentation.com/design/charts/). This chooser presents a nice compact decision tree to understand how certain charts are best used to explain specific relationships. It is neither perfect, nor complete, but it is an excellent starting point.

Another good resource is the Periodic Table of Visualization Methods by Visual Literacy (http://www.visual-literacy.org/periodic_table/periodic_table.html#). This resource gives a wide spectrum of visualization options and groups them by What is being visualized (Data, Information, Concept, Strategy, Metaphor, Compound), whether the visualization is Process or Structure, whether the visualization is on the Overview or Detail level or both, and whether a visualization is geared towards Convergent or Divergent thinking. It does not do a good job of explaining when or how to use any particular visualization, but it is a nice complement to the Extreme Presentation decision tree by providing a bit of context around the purpose of certain visualizations.