better clarify how to do prediction in documentation #92

@dustinvtran

from an e-mail exchange with @drizzersilverberg:

  1. Can 'sgd' package perform linear regression task to build a model and predict a set of data using the built model?

  2. If so, can you show me the way (code/script) to build the model and predict simple data? as an example, you can use this house_price data.

  3. Last, can 'sgd' compute the prediction error using mean-square-error/root-mean-square-error? If so, can you show me the way (code/script) to use it in above example?

my reply:

  1. the main function sgd will estimate parameters for a chosen model such as linear regression. there are utility functions to handle the tasks post-estimation. for example, the predict function takes the output of sgd and test covariates/features as input; the output is the predicted response. (see ?predict.sgd)

  2. following the example on the README.md (https://github.com/airoldilab/sgd/blob/master/NAMESPACE), you can do something like:

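# test covariates; d is the number of features, as in the README example
X_test <- matrix(rnorm(50*d), ncol=d)
# intercept column first, the order a formula fit (y ~ .) uses for its coefficients
y_hat <- predict(sgd.theta, cbind(1, X_test))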

  3. to get the numerical values, you have to do it manually; for example, we have a demo for mean-squared error over the parameters (see ?sgd):

sprintf("Mean squared error: %0.3f", mean((theta - as.numeric(sgd.theta$coefficients))^2))

however, we also have plots that show MSE or classification error in predictions (see ?plot). including the numerical output would be as simple as exposing the utility functions we wrote there; we chose not to, so that users do not come to rely on helper functions.
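
for prediction error specifically (as opposed to error over the parameters), a minimal sketch of the manual computation could look like the following. `mse` and `rmse` are hypothetical helpers, not functions exported by sgd, and `y_hat` here is a stand-in built from the true coefficients so the snippet runs without a fitted model; in practice it would be the output of predict on the sgd fit.

```r
# Hypothetical helpers (not part of the sgd package): prediction error metrics.
mse  <- function(y, y_hat) mean((as.numeric(y) - as.numeric(y_hat))^2)
rmse <- function(y, y_hat) sqrt(mse(y, y_hat))

set.seed(42)
d <- 5                                   # number of features (assumed for illustration)
theta <- rep(5, d + 1)                   # true coefficients, intercept first

# Simulated held-out data: 50 test points with Gaussian noise.
X_test <- matrix(rnorm(50 * d), ncol = d)
y_test <- cbind(1, X_test) %*% theta + rnorm(50)

# Stand-in predictions from the true coefficients. With a fitted model,
# replace this line with the result of predict() on the sgd output.
y_hat <- cbind(1, X_test) %*% theta

cat(sprintf("Test MSE:  %0.3f\n", mse(y_test, y_hat)))
cat(sprintf("Test RMSE: %0.3f\n", rmse(y_test, y_hat)))
```

the same two helpers apply to real data such as the house_price set: hold out some rows, predict on them, and compare against the observed responses.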

i think these could be made more explicit in the documentation.
