Add documentation for methods #17

Merged
merged 9 commits into from
Jun 30, 2016
93 changes: 92 additions & 1 deletion lib/statsample-glm/glm/base.rb
@@ -31,7 +31,21 @@ def initialize ds, y, opts={}
.const_get("#{algorithm}").const_get("#{method}")
.new(@data_set, @dependent, @opts)
end


# Returns the coefficients of the trained model
#
# @param [Symbol] as_a Form of the output: :vector (default), :hash, or :array
#
# @return [Vector, Hash, Array] coefficients of the model
#
# @example
# require 'statsample-glm'
# data_set = Daru::DataFrame.from_csv "spec/data/logistic.csv"
# glm = Statsample::GLM.compute data_set, "y", :logistic, {constant: 1}
# glm.coefficients :hash
# # =>
# # {:x1=>-0.3124937545689041, :x2=>2.286713333462646, :constant=>0.675603176233328}
#
def coefficients as_a=:vector
case as_a
when :hash
@@ -49,6 +63,23 @@ def coefficients as_a=:vector
end
end

# Returns the standard errors for the coefficient estimates
#
# @param [Symbol] as_a Form of the output: :vector (default), :hash, or :array
#
# @return [Vector, Hash, Array] standard errors of the coefficient estimates
#
# @example
# require 'statsample-glm'
# data_set = Daru::DataFrame.from_csv "spec/data/logistic.csv"
# glm = Statsample::GLM.compute data_set, "y", :logistic, {constant: 1}
# glm.standard_error
# # #<Daru::Vector:25594060 @name = nil @metadata = {} @size = 3 >
# # nil
# # 0 0.4130813039878828
# # 1 0.7194644911927432
# # 2 0.40380565497038895
#
def standard_error as_a=:vector
case as_a
when :hash
@@ -70,18 +101,78 @@ def iterations
@regression.iterations
end

# Returns the values predicted by the model
#
# @return [Vector] vector of predicted values
#
# @example
# require 'statsample-glm'
# data_set = Daru::DataFrame.from_csv "spec/data/logistic.csv"
# glm = Statsample::GLM.compute data_set, "y", :logistic, constant: 1
# glm.fitted_mean_values
# # =>
# # #<Daru::Vector:27008600 @name = nil @metadata = {} @size = 50 >
# # nil
# # 0 0.18632025624516532
# # 1 0.5146459448198846
# # 2 0.84083523282549
# # 3 0.9241524337773334
# # 4 0.7718528863631826
# # ... ...
#
def fitted_mean_values
@regression.fitted_mean_values
end

# Returns the residual for every data point
#
# @return [Vector] all residuals in a vector
#
# @example
# require 'statsample-glm'
# data_set = Daru::DataFrame.from_csv "spec/data/logistic.csv"
# glm = Statsample::GLM.compute data_set, "y", :logistic, {constant: 1}
# glm.residuals
# # #<Daru::Vector:22263420 @name = y @metadata = {} @size = 50 >
# # y
# # 0 -0.18632025624516532
# # 1 -0.5146459448198846
# # 2 0.15916476717451
# # 3 -0.9241524337773334
# # 4 0.2281471136368174
# # ... ...
#
def residuals
@regression.residuals
end

# Returns the degrees of freedom value.
#
# @return [Integer] the degrees of freedom
#
# @example
# require 'statsample-glm'
# data_set = Daru::DataFrame.from_csv "spec/data/logistic.csv"
# glm = Statsample::GLM.compute data_set, "y", :logistic, constant: 1
# glm.degree_of_freedom
# # => 47
#
def degree_of_freedom
@regression.degree_of_freedom
Member Author

I found that in most places it's called "degrees of freedom", not "degree of freedom". Should we change it?

Member

@v0dro v0dro May 17, 2016

This is a question for @agisga. Is it referred to as degree or degrees in the world of statistics?

Member

But be sure not to break backward compatibility even if you rename the method (alias it).


I agree. I think the plural "degrees" is more correct.

Similarly, maybe we should change #standard_error to the plural form #standard_errors, because it returns a vector of standard errors rather than just one value. What do you guys think?

Member

I'm ok with it. Just be sure to make the appropriate aliases to not break backwards compatibility. We'll remove the alias after 2-3 releases.
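
The rename-with-alias approach discussed in this thread could look roughly like the sketch below. `ExampleModel` and its constructor arguments are hypothetical stand-ins, not the actual `Statsample::GLM` class; only the `alias_method` pattern is the point.

```ruby
# Hypothetical stand-in for the GLM class; only the renaming pattern matters.
class ExampleModel
  def initialize(observations, parameters)
    @observations = observations
    @parameters = parameters
  end

  # The new plural name carries the implementation.
  def degrees_of_freedom
    @observations - @parameters
  end

  # The old singular name remains callable, so existing callers keep working.
  alias_method :degree_of_freedom, :degrees_of_freedom
end

model = ExampleModel.new(50, 3)
puts model.degrees_of_freedom
puts model.degree_of_freedom
```

Because the alias is a separate line, it can be deleted after the deprecation window without touching the method body.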

end

# Returns the optimal value of the log-likelihood function when using the MLE algorithm.
# The optimal value is the value of the log-likelihood function at the MLE solution.
#
# @return [Numeric, nil] the optimal value of the log-likelihood function, or nil if the algorithm is not :mle
#
# @example
# require 'statsample-glm'
# data_set = Daru::DataFrame.from_csv "spec/data/logistic.csv"
# glm = Statsample::GLM.compute data_set, "y", :logistic, constant: 1, algorithm: :mle
# glm.log_likelihood
# # => -21.4752278175261
#
def log_likelihood
@regression.log_likelihood if @opts[:algorithm] == :mle
end
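
For readers unfamiliar with the quantity documented above, here is a minimal sketch of a log-likelihood for a binary response, using the standard Bernoulli form. The helper name `bernoulli_log_likelihood` is hypothetical, and it is an assumption that the library's MLE code evaluates this same expression for the logistic case.

```ruby
# Standard Bernoulli log-likelihood: sum over points of
#   y * log(p) + (1 - y) * log(1 - p)
# where y is the observed 0/1 response and p the fitted probability.
# Hypothetical helper, not part of statsample-glm.
def bernoulli_log_likelihood(observed, fitted)
  observed.zip(fitted).reduce(0.0) do |total, (y, p)|
    total + y * Math.log(p) + (1 - y) * Math.log(1 - p)
  end
end

# Toy input: two points, each fitted at probability 0.5.
toy_value = bernoulli_log_likelihood([1, 0], [0.5, 0.5])
puts toy_value
```

The value is never positive, and it moves closer to zero as the fitted probabilities move toward the observed responses.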
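
The example outputs in the new docs can be cross-checked against each other. The sketch below uses plain Ruby arrays rather than `Daru::Vector`. The `observed` values are not shown in the diff; they are inferred here on the assumption that each residual is the observed response minus the fitted value. The degrees-of-freedom line likewise assumes the value is the number of observations minus the number of estimated coefficients, which is consistent with the documented 50, 3, and 47.

```ruby
# First five fitted values, copied from the #fitted_mean_values example.
fitted = [0.18632025624516532, 0.5146459448198846, 0.84083523282549,
          0.9241524337773334, 0.7718528863631826]

# First five residuals, copied from the #residuals example.
documented_residuals = [-0.18632025624516532, -0.5146459448198846,
                        0.15916476717451, -0.9241524337773334,
                        0.2281471136368174]

# Assumption: observed responses inferred from the two vectors above.
observed = [0, 0, 1, 0, 1]

# Residual = observed response minus fitted mean value.
residuals = observed.zip(fitted).map { |y, mu| y - mu }

# Assumption: degrees of freedom = observations - estimated coefficients.
sample_size = 50        # @size of the fitted-values vector in the docs
coefficient_count = 3   # x1, x2, and the constant
degrees_of_freedom = sample_size - coefficient_count

puts residuals.inspect
puts degrees_of_freedom
```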