Data Mining Terms:
Listed below are the principal objects and terms used in the JDM
specification.
Connection:
Users obtain a connection object from a connection factory and
use it to access the DME. The connection object also provides
access to the objects in the metadata repository (MR): it can
create, retrieve, and delete mining objects stored there.
Task:
A task object defines work to be performed by the DME. JDM
defines tasks to build, test, and apply models. It also provides
tasks to compute statistics and to import and export data. Tasks
can be grouped into batches and scheduled for execution by the
application.
Execution Handle and Status:
An asynchronous task produces an execution handle that can be
used to track the task while it executes and after it completes.
The handle also provides a way to block until the task completes,
which makes the task effectively synchronous. It can also be used
to terminate an executing task; how termination is actually
carried out is left to the vendor implementing the JDM
specification.
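The track, block, and terminate roles of a handle can be sketched
with the standard java.util.concurrent classes. This is an analogy
only, not the JDM API: Future stands in for the execution handle,
and the class name, buildModel, and the sleeping task are
hypothetical placeholders for real mining work.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Analogy only: java.util.concurrent.Future plays the role of an execution handle.
class ExecutionHandleSketch {

    // Hypothetical stand-in for a long-running mining task.
    static int buildModel() {
        int sum = 0;
        for (int i = 1; i <= 100; i++) {
            sum += i;
        }
        return sum;
    }

    // Submits the task asynchronously, then blocks on the handle until it completes.
    static int runAndWait() {
        ExecutorService engine = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = ExecutionHandleSketch::buildModel;
            Future<Integer> handle = engine.submit(task);
            return handle.get(); // blocking here makes the call effectively synchronous
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            engine.shutdown();
        }
    }

    // Starts a slow task and asks the handle to terminate it.
    static boolean runAndCancel() {
        ExecutorService engine = Executors.newSingleThreadExecutor();
        try {
            Runnable slowTask = () -> {
                try {
                    Thread.sleep(60_000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            };
            Future<?> handle = engine.submit(slowTask);
            handle.cancel(true);         // request termination
            return handle.isCancelled(); // status query on the handle
        } finally {
            engine.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("result = " + runAndWait());
        System.out.println("cancelled = " + runAndCancel());
    }
}
```

In a real JDM implementation the vendor decides how termination
behaves, so the cancel step above should be read as one possible
behavior rather than a guarantee.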
Physical Data Set:
A physical data set is the actual data used as input to a data
mining operation. The physical data set object can represent
relational tables, star schemas, structured files, XML files, and
OLAP cubes. In the first release of the specification, only
tables and files are supported.
Physical Data Record:
Physical data records are used for single-case scoring, for both
input and output.
Build Settings:
Build settings hold the parameters required to build a model,
including the algorithm to be used. Each parameter has a default
value, which is used when the user omits that parameter.
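The default-value behavior can be illustrated with a small
settings holder. This is a sketch under stated assumptions, not
the JDM API: the class, the parameter names "algorithm" and
"maxDepth", and their default values are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical settings holder; the names and defaults are illustrative only.
class BuildSettingsSketch {
    private static final Map<String, Object> DEFAULTS = new HashMap<>();
    static {
        DEFAULTS.put("algorithm", "decisionTree");
        DEFAULTS.put("maxDepth", 10);
    }

    // Values explicitly supplied by the user.
    private final Map<String, Object> overrides = new HashMap<>();

    // Records a user-supplied value and returns this object for chaining.
    BuildSettingsSketch set(String name, Object value) {
        overrides.put(name, value);
        return this;
    }

    // Returns the user's value if one was set, otherwise the default.
    Object get(String name) {
        return overrides.getOrDefault(name, DEFAULTS.get(name));
    }

    public static void main(String[] args) {
        BuildSettingsSketch settings = new BuildSettingsSketch().set("maxDepth", 5);
        System.out.println("algorithm = " + settings.get("algorithm"));
        System.out.println("maxDepth = " + settings.get("maxDepth"));
    }
}
```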
Algorithm:
An algorithm is applied to a set of data to produce a model. JDM
does not define a large number of algorithms, but it provides
mechanisms for adding new ones. An algorithm can optionally have
a settings object that holds its parameters.
Algorithm Settings:
Algorithm settings set the parameters of the particular algorithm
that is selected, which helps in fine-tuning the algorithm.
Model:
A model is produced when an algorithm is applied to a set of
data. In the first release, models are read-only and are stored
in the metadata repository. A model is specific to the algorithm
used to create it and is related to the task that created it.
Model Signature:
A model signature defines the input parameters required to use
the model. The signature consists of attributes, each described
by properties such as name, data type, and attribute type.
Model Detail:
A model detail object represents the detailed state of a model.
The details are specific to the algorithm used and change when
the model or algorithm changes.
Attribute Statistics Set:
An attribute statistics set contains statistics on a set of
attributes. It is created when statistics are computed on a data
set object.
Confusion Matrix:
A confusion matrix is produced when a model is tested. It shows
the user how well the model predicts values and where it makes
mistakes.
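As a minimal sketch of the idea (plain Java, not the JDM API), a
confusion matrix can be built by counting each pair of actual and
predicted class. The class name, the integer class encoding, and
the sample data below are assumptions made for illustration.

```java
// Illustrative only: classes are encoded as integers 0..numClasses-1.
class ConfusionMatrixSketch {

    // Rows are actual classes, columns are predicted classes.
    static int[][] build(int[] actual, int[] predicted, int numClasses) {
        int[][] matrix = new int[numClasses][numClasses];
        for (int i = 0; i < actual.length; i++) {
            matrix[actual[i]][predicted[i]]++;
        }
        return matrix;
    }

    // Fraction of cases on the diagonal, i.e. predicted correctly.
    static double accuracy(int[][] matrix) {
        int correct = 0;
        int total = 0;
        for (int row = 0; row < matrix.length; row++) {
            for (int col = 0; col < matrix[row].length; col++) {
                total += matrix[row][col];
                if (row == col) {
                    correct += matrix[row][col];
                }
            }
        }
        return (double) correct / total;
    }

    public static void main(String[] args) {
        int[] actual    = {0, 0, 1, 1, 1, 0};
        int[] predicted = {0, 1, 1, 1, 0, 0};
        int[][] matrix = build(actual, predicted, 2);
        for (int[] row : matrix) {
            System.out.println(java.util.Arrays.toString(row));
        }
        System.out.println("accuracy = " + accuracy(matrix));
    }
}
```

The off-diagonal cells are where the model makes mistakes: cell
[a][p] counts the cases of actual class a that were predicted as
class p.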
Lift:
Lift is the ratio between the results obtained with and without
the predictive model. It provides a measure of how successful
the predictive model is.
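One common way to compute lift, sketched here in plain Java
rather than with the JDM API, compares the rate of positive cases
among the k highest-scored cases (with the model) to the rate
across all cases (without the model). The class name, the method
liftAtTop, and the sample scores are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative only: lift for the k cases the model scores highest.
class LiftSketch {

    static double liftAtTop(double[] scores, boolean[] isPositive, int k) {
        int n = scores.length;

        // Rank case indices by model score, highest first.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> scores[i]).reversed());

        // Positive cases found among the top k ranked cases.
        int hitsInTop = 0;
        for (int rank = 0; rank < k; rank++) {
            if (isPositive[order[rank]]) {
                hitsInTop++;
            }
        }

        // Positive cases in the whole data set.
        int totalHits = 0;
        for (boolean positive : isPositive) {
            if (positive) {
                totalHits++;
            }
        }

        double rateWithModel = (double) hitsInTop / k;
        double rateWithoutModel = (double) totalHits / n;
        return rateWithModel / rateWithoutModel;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05};
        boolean[] isPositive = {true, true, false, true, false, false, false, false};
        System.out.println("lift at top 4 = " + liftAtTop(scores, isPositive, 4));
    }
}
```

A lift above 1 means that selecting cases with the model finds
positives at a higher rate than selecting cases at random.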
Cost Matrix:
A cost matrix is a tabular representation of the cost associated
with each combination of predicted value and actual value.
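A minimal sketch of how such a table might be used, in plain Java
rather than the JDM API: multiply the number of cases in each
actual/predicted cell by the cost assigned to that cell and sum
the results. The class name, the counts, and the cost values are
hypothetical.

```java
// Illustrative only: rows are actual classes, columns are predicted classes.
class CostMatrixSketch {

    // Sums count * cost over every actual/predicted combination.
    static double totalCost(int[][] counts, double[][] cost) {
        double total = 0.0;
        for (int actual = 0; actual < counts.length; actual++) {
            for (int predicted = 0; predicted < counts[actual].length; predicted++) {
                total += counts[actual][predicted] * cost[actual][predicted];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical test results: counts of cases per actual/predicted pair.
        int[][] counts = {
            {2, 1},
            {1, 2}
        };
        // Hypothetical costs: correct predictions cost nothing; missing an
        // actual class 1 case is five times as costly as a false alarm.
        double[][] cost = {
            {0.0, 1.0},
            {5.0, 0.0}
        };
        System.out.println("total cost = " + totalCost(counts, cost));
    }
}
```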