java.lang.NullPointerException #160

Open
shabir1 opened this issue Jun 11, 2019 · 0 comments
shabir1 commented Jun 11, 2019

data sample:
Document1 label1 forest=3.4 tree=5 wood=2.85 hammer=1 colour=1 leaf=1.5
Document2 label2 forest=10 tree=5 wood=2.75 hammer=1 colour=4 leaf=1

String lineRegex = "^(\\S*)[\\s,]*(\\S*)[\\s,]*(.*)$";
String dataRegex = "[\\p{L}([0-9]*\\.[0-9]+|[0-9]+)_\\=]+";
ArrayList<Pipe> pipeList = new ArrayList<Pipe>();
pipeList.add(new Target2Label());
pipeList.add( new Input2CharSequence() );
pipeList.add( new CharSequence2TokenSequence(Pattern.compile(dataRegex)) );
pipeList.add( new TokenSequenceParseFeatureString(true,true,"=") );
pipeList.add( new PrintInputAndTarget());
InstanceList instances = new InstanceList (new SerialPipes(pipeList));
Reader fileReader = new InputStreamReader(new FileInputStream(new File(dataPath)),
"UTF-8");
instances.addThruPipe(new CsvIterator (fileReader, Pattern.compile(lineRegex),
3, 2, 1));

ClassifierTrainer trainClassify = new NaiveBayesTrainer();
trainClassify.train(instances);
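
For reference, the two patterns can be checked on their own, without MALLET. This is only a minimal sketch using `java.util.regex`. It assumes the line pattern is the usual three-group form `^(\S*)[\s,]*(\S*)[\s,]*(.*)$` (name, label, data), matching the group order 3, 2, 1 passed to `CsvIterator`. The class and method names (`RegexCheck`, `splitLine`, `tokens`) are made up for illustration and are not part of MALLET.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCheck {
    // Assumed line pattern: group 1 = name, group 2 = label, group 3 = data.
    static final Pattern LINE =
            Pattern.compile("^(\\S*)[\\s,]*(\\S*)[\\s,]*(.*)$");

    // Token pattern from the report, with backslashes escaped for a Java string.
    // Note that it is a single character class, so "(", ")", "*", "+" and "|"
    // inside it are literal characters, not grouping or alternation.
    static final Pattern TOKEN =
            Pattern.compile("[\\p{L}([0-9]*\\.[0-9]+|[0-9]+)_\\=]+");

    /** Returns {name, label, data}, or null if the line does not match. */
    static String[] splitLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    /** Returns every substring of the data field that the token pattern matches. */
    static List<String> tokens(String data) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(data);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String line =
                "Document1 label1 forest=3.4 tree=5 wood=2.85 hammer=1 colour=1 leaf=1.5";
        String[] parts = splitLine(line);
        System.out.println("name   = " + parts[0]);
        System.out.println("label  = " + parts[1]);
        System.out.println("tokens = " + tokens(parts[2]));
    }
}
```

On the first sample line this should give `Document1`, `label1` and six `name=value` tokens, so the line splitting and tokenizing look fine in isolation and the exception probably comes from a later stage.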

.
.
.
.
name: 1419
target: +adwapq-50k
input: TokenSequence [CapitalGain=0.0 span[0..15], education=5 feature(education)=5.0 span[16..27], occupation=0 span[28..40], race=0 span[41..47], sex=1 feature(sex)=1.0 span[48..53], capitalLoss=0.0 span[54..69], HoursPerWeek=40.0 feature(HoursPerWeek)=40.0 span[70..87], fnlwgt=115070.0 feature(fnlwgt)=115070.0 span[88..103], MaritalStatus=0 span[104..119], NativeCountry=0 span[120..135], workclass=2 feature(workclass)=2.0 span[136..147], relationship=0 span[148..162], age=47.0 feature(age)=47.0 span[163..171], EducationNum=10.0 feature(EducationNum)=10.0 span[172..189]]
Token#0:CapitalGain=0.0 span[0..15]
Token#1:education=5 feature(education)=5.0 span[16..27]
Token#2:occupation=0 span[28..40]
Token#3:race=0 span[41..47]
Token#4:sex=1 feature(sex)=1.0 span[48..53]
Token#5:capitalLoss=0.0 span[54..69]
Token#6:HoursPerWeek=40.0 feature(HoursPerWeek)=40.0 span[70..87]
Token#7:fnlwgt=115070.0 feature(fnlwgt)=115070.0 span[88..103]
Token#8:MaritalStatus=0 span[104..119]
Token#9:NativeCountry=0 span[120..135]
Token#10:workclass=2 feature(workclass)=2.0 span[136..147]
Token#11:relationship=0 span[148..162]
Token#12:age=47.0 feature(age)=47.0 span[163..171]
Token#13:EducationNum=10.0 feature(EducationNum)=10.0 span[172..189]

name: 1420
target: +adwapq-50k
input: TokenSequence [CapitalGain=0.0 span[0..15], education=5 feature(education)=5.0 span[16..27], occupation=11 feature(occupation)=11.0 span[28..41], race=0 span[42..48], sex=0 span[49..54], capitalLoss=0.0 span[55..70], HoursPerWeek=50.0 feature(HoursPerWeek)=50.0 span[71..88], fnlwgt=172582.0 feature(fnlwgt)=172582.0 span[89..104], MaritalStatus=0 span[105..120], NativeCountry=0 span[121..136], workclass=5 feature(workclass)=5.0 span[137..148], relationship=3 feature(relationship)=3.0 span[149..163], age=19.0 feature(age)=19.0 span[164..172], EducationNum=10.0 feature(EducationNum)=10.0 span[173..190]]
Token#0:CapitalGain=0.0 span[0..15]
Token#1:education=5 feature(education)=5.0 span[16..27]
Token#2:occupation=11 feature(occupation)=11.0 span[28..41]
Token#3:race=0 span[42..48]
Token#4:sex=0 span[49..54]
Token#5:capitalLoss=0.0 span[55..70]
Token#6:HoursPerWeek=50.0 feature(HoursPerWeek)=50.0 span[71..88]
Token#7:fnlwgt=172582.0 feature(fnlwgt)=172582.0 span[89..104]
Token#8:MaritalStatus=0 span[105..120]
Token#9:NativeCountry=0 span[121..136]
Token#10:workclass=5 feature(workclass)=5.0 span[137..148]
Token#11:relationship=3 feature(relationship)=3.0 span[149..163]
Token#12:age=19.0 feature(age)=19.0 span[164..172]
Token#13:EducationNum=10.0 feature(EducationNum)=10.0 span[173..190]

java.lang.NullPointerException
at cc.mallet.types.Multinomial$Estimator.setAlphabet(Multinomial.java:308)
at cc.mallet.classify.NaiveBayesTrainer.setup(NaiveBayesTrainer.java:251)
at cc.mallet.classify.NaiveBayesTrainer.trainIncremental(NaiveBayesTrainer.java:200)
at cc.mallet.classify.NaiveBayesTrainer.train(NaiveBayesTrainer.java:193)
at cc.mallet.classify.NaiveBayesTrainer.train(NaiveBayesTrainer.java:59)
