Package weka.clusterers
Class FarthestFirst
java.lang.Object
  weka.clusterers.AbstractClusterer
    weka.clusterers.RandomizableClusterer
      weka.clusterers.FarthestFirst
- All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Clusterer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
public class FarthestFirst extends RandomizableClusterer implements TechnicalInformationHandler
Cluster data using the FarthestFirst algorithm.
For more information see:
Hochbaum, Shmoys (1985). A best possible heuristic for the k-center problem. Mathematics of Operations Research. 10(2):180-184.
Sanjoy Dasgupta: Performance Guarantees for Hierarchical Clustering. In: 15th Annual Conference on Computational Learning Theory, 351-363, 2002.
Notes:
- works as a fast simple approximate clusterer
- modelled after SimpleKMeans, might be a useful initializer for it
BibTeX:
@article{Hochbaum1985,
  author = {Hochbaum and Shmoys},
  journal = {Mathematics of Operations Research},
  number = {2},
  pages = {180-184},
  title = {A best possible heuristic for the k-center problem},
  volume = {10},
  year = {1985}
}
@inproceedings{Dasgupta2002,
  author = {Sanjoy Dasgupta},
  booktitle = {15th Annual Conference on Computational Learning Theory},
  pages = {351-363},
  publisher = {Springer},
  title = {Performance Guarantees for Hierarchical Clustering},
  year = {2002}
}
Valid options are:
-N <num> Number of clusters. (default = 2)
-S <num> Random number seed. (default 1)
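The k-center heuristic cited above can be illustrated with a farthest-first traversal: pick a first center, then repeatedly pick the point whose distance to its nearest already-chosen center is largest. The following is a minimal, self-contained sketch of that idea, not Weka's implementation. The names `FarthestFirstSketch` and `chooseCenters` are hypothetical, plain squared Euclidean distance on numeric arrays is assumed, and drawing the first center at random from the seed is an assumption about how the -S option is used.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration class; not part of Weka.
class FarthestFirstSketch {

    /** Squared Euclidean distance between two points of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the indices of k centers chosen by farthest-first traversal. */
    static int[] chooseCenters(double[][] points, int k, long seed) {
        int n = points.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int[] centers = new int[k];
        // minDist[i] holds the distance from point i to its nearest chosen center.
        double[] minDist = new double[n];
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);

        // Assumption: the first center is drawn at random from the seed.
        int current = new Random(seed).nextInt(n);
        for (int c = 0; c < k; c++) {
            centers[c] = current;
            int farthest = 0;
            double best = -1.0;
            for (int i = 0; i < n; i++) {
                // Update nearest-center distances with the newly added center.
                minDist[i] = Math.min(minDist[i], squaredDistance(points[i], points[current]));
                if (minDist[i] > best) {
                    best = minDist[i];
                    farthest = i;
                }
            }
            // The next center is the point farthest from all centers chosen so far.
            current = farthest;
        }
        return centers;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {1, 0}, {10, 0}, {11, 0}};
        System.out.println(Arrays.toString(chooseCenters(points, 2, 1L)));
    }
}
```

Each added center costs one pass over the data, so choosing k centers from n points takes on the order of n times k distance computations, which is consistent with the note that the method works as a fast approximate clusterer.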
- Version:
- $Revision: 5538 $
- Author:
- Bernhard Pfahringer (bernhard@cs.waikato.ac.nz)
- See Also:
RandomizableClusterer, Serialized Form
Constructor Summary
Constructor
FarthestFirst()
Method Summary
Modifier and Type  Method
    Description
void buildClusterer(Instances data)
    Generates a clusterer.
int clusterInstance(Instance instance)
    Assigns a given instance to a cluster.
Capabilities getCapabilities()
    Returns default capabilities of the clusterer.
int getNumClusters()
    Gets the number of clusters to generate.
java.lang.String[] getOptions()
    Gets the current settings of FarthestFirst.
java.lang.String getRevision()
    Returns the revision string.
TechnicalInformation getTechnicalInformation()
    Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
java.lang.String globalInfo()
    Returns a string describing this clusterer.
java.util.Enumeration listOptions()
    Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
    Main method for testing this class.
int numberOfClusters()
    Returns the number of clusters.
java.lang.String numClustersTipText()
    Returns the tip text for this property.
void setNumClusters(int n)
    Sets the number of clusters to generate.
void setOptions(java.lang.String[] options)
    Parses a given list of options.
java.lang.String toString()
    Returns a string describing this clusterer.
Methods inherited from class weka.clusterers.RandomizableClusterer
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.clusterers.AbstractClusterer
distributionForInstance, forName, makeCopies, makeCopy
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing this clusterer.
- Returns:
- a description of the clusterer suitable for displaying in the explorer/experimenter GUI
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the clusterer.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Specified by:
getCapabilities in interface Clusterer
- Overrides:
getCapabilities in class AbstractClusterer
- Returns:
- the capabilities of this clusterer
- See Also:
Capabilities
-
buildClusterer
public void buildClusterer(Instances data) throws java.lang.Exception
Generates a clusterer. Has to initialize all fields of the clusterer that are not being set via options.
- Specified by:
buildClusterer in interface Clusterer
- Specified by:
buildClusterer in class AbstractClusterer
- Parameters:
data - set of instances serving as training data
- Throws:
java.lang.Exception - if the clusterer has not been generated successfully
-
clusterInstance
public int clusterInstance(Instance instance) throws java.lang.Exception
Assigns a given instance to a cluster.
- Specified by:
clusterInstance in interface Clusterer
- Overrides:
clusterInstance in class AbstractClusterer
- Parameters:
instance - the instance to be assigned to a cluster
- Returns:
- the index of the cluster the instance is assigned to
- Throws:
java.lang.Exception - if the instance could not be assigned to a cluster
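Conceptually, assigning an instance to a cluster means finding the closest of the centers chosen during training and returning its index. The sketch below shows that step in isolation under stated assumptions: `NearestCenterSketch` and `assign` are hypothetical names, instances are plain numeric arrays, and squared Euclidean distance is used. Weka's own distance handling (for example for nominal or missing values) is not reproduced here.

```java
// Hypothetical illustration class; not part of Weka.
class NearestCenterSketch {

    /** Returns the index of the center closest to the given point. */
    static int assign(double[][] centers, double[] point) {
        if (centers.length == 0) {
            throw new IllegalArgumentException("No centers available");
        }
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            // Assumption: squared Euclidean distance on numeric attributes.
            double dist = 0.0;
            for (int i = 0; i < point.length; i++) {
                double d = point[i] - centers[c][i];
                dist += d * d;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centers = {{0, 0}, {10, 10}};
        System.out.println(assign(centers, new double[] {1, 1})); // prints 0
        System.out.println(assign(centers, new double[] {9, 8})); // prints 1
    }
}
```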
-
numberOfClusters
public int numberOfClusters() throws java.lang.Exception
Returns the number of clusters.
- Specified by:
numberOfClusters in interface Clusterer
- Specified by:
numberOfClusters in class AbstractClusterer
- Returns:
- the number of clusters generated for a training dataset
- Throws:
java.lang.Exception - if the number of clusters could not be returned successfully
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class RandomizableClusterer
- Returns:
- an enumeration of all the available options
-
numClustersTipText
public java.lang.String numClustersTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setNumClusters
public void setNumClusters(int n) throws java.lang.Exception
Sets the number of clusters to generate.
- Parameters:
n - the number of clusters to generate
- Throws:
java.lang.Exception - if the number of clusters is negative
-
getNumClusters
public int getNumClusters()
Gets the number of clusters to generate.
- Returns:
- the number of clusters to generate
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-N <num> Number of clusters. (default = 2)
-S <num> Random number seed. (default 1)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class RandomizableClusterer
- Parameters:
options - the list of options as an array of strings
- Throws:
java.lang.Exception - if an option is not supported
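To illustrate the option format, here is a minimal sketch of reading -N and -S from an options array, falling back to the documented defaults (2 and 1) when a flag is absent. The class `OptionSketch` and the helper `intOption` are hypothetical and are not part of Weka; the parsing that setOptions actually performs is not shown here.

```java
// Hypothetical illustration class; not part of Weka.
class OptionSketch {

    /** Returns the integer following the given flag, or defaultValue if the flag is absent. */
    static int intOption(String flag, String[] options, int defaultValue) {
        for (int i = 0; i < options.length; i++) {
            if (flag.equals(options[i])) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("Missing value for " + flag);
                }
                return Integer.parseInt(options[i + 1]);
            }
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        String[] options = {"-N", "3", "-S", "42"};
        int numClusters = intOption("-N", options, 2); // documented default: 2
        int seed = intOption("-S", options, 1);        // documented default: 1
        System.out.println("clusters=" + numClusters + ", seed=" + seed); // prints "clusters=3, seed=42"
    }
}
```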
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of FarthestFirst.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class RandomizableClusterer
- Returns:
- an array of strings suitable for passing to setOptions()
-
toString
public java.lang.String toString()
Returns a string describing this clusterer.
- Overrides:
toString in class java.lang.Object
- Returns:
- a description of the clusterer as a string
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class AbstractClusterer
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
Main method for testing this class.
- Parameters:
argv - should contain the following arguments: -t training file [-N number of clusters]
-
-