
Running head: PROBABILITY MODELING

Probability Modeling

Hal Hagood

u07a1

(Instructions) Using case study “O” and the classroom instructions, the following procedure was used:

“1) In SAS Enterprise Miner, create a new project, and a new diagram within that project, then:

2) Click on Sample > File Import node to import the MS Excel file into a SAS dataset.

3) Be sure Category Gross is set to "Target" for role in the File Import node.

4) Use Text Miner > Text Parsing node to parse the data (default settings are OK)

5) Use Text Miner > Text Filter node to filter the text (default settings are OK, although you can adjust if you wish to do any of the additional detailed steps in original tutorial from the textbook)

6) Use Text Miner > Text Cluster node to cluster the data with all default settings except the following: SVD Resolution should be set to "High" and Max SVD Dimensions should be set to "5"

7) Run full diagram

8) Once this full diagram is run, go into the properties for the Text Cluster node, and click the ... to the right of "Export data" and select "Explore."

9) In the "Sample Properties" window, click on "Plot..." leave it as Scatter (the first option), and click "Next>" Scroll down in the list of variables, and click in the cell under the "Role" column next to "TextCluster_SVD1" and select "X". Click in the cell under the "Role" column next to "TextCluster_SVD2" and select "Y." then click "Finish." This will create a scatterplot of SVD1 vs SVD2.

10) Do the same for each combination of the 5 SVD variables, and take screenshots of each as you go, so that you can include them in your output/assignment: TextCluster_SVD1 vs TextCluster_SVD2, TextCluster_SVD1 vs TextCluster_SVD3, TextCluster_SVD1 vs TextCluster_SVD4, TextCluster_SVD1 vs TextCluster_SVD5, TextCluster_SVD2 vs TextCluster_SVD3, TextCluster_SVD2 vs TextCluster_SVD4, TextCluster_SVD2 vs TextCluster_SVD5, TextCluster_SVD3 vs TextCluster_SVD4, TextCluster_SVD3 vs TextCluster_SVD5, TextCluster_SVD4 vs TextCluster_SVD5” (courserooma.capella, 2017).
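
Step 10 asks for every pairwise combination of the five SVD variables, which comes to ten scatterplots. As a quick completeness check, the pairs can be enumerated with a short Python sketch. Python is an assumption here, since the assignment itself is carried out in the SAS Enterprise Miner interface; only the variable names are taken from the instructions.

```python
from itertools import combinations

# Names of the five SVD dimensions exported by the Text Cluster node,
# as listed in the instructions.
svd_vars = [f"TextCluster_SVD{i}" for i in range(1, 6)]

# Every unordered pair of distinct dimensions: C(5, 2) = 10 scatterplots.
svd_pairs = list(combinations(svd_vars, 2))

for x_var, y_var in svd_pairs:
    print(f"{x_var} vs {y_var}")
```

The order produced matches the order in which the plots are listed in the instructions, so the printed list can serve as a screenshot checklist.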



Produces a probability model for a text data set



Create scatter plots of the SVD dimensions using statistical software for text mining

TextCluster_SVD1 vs TextCluster_SVD2.

TextCluster_SVD1 vs TextCluster_SVD3.

TextCluster_SVD1 vs TextCluster_SVD4.

TextCluster_SVD1 vs TextCluster_SVD5.

TextCluster_SVD2 vs TextCluster_SVD3.

TextCluster_SVD2 vs TextCluster_SVD4.

TextCluster_SVD2 vs TextCluster_SVD5.

TextCluster_SVD3 vs TextCluster_SVD4.

TextCluster_SVD3 vs TextCluster_SVD5.

TextCluster_SVD4 vs TextCluster_SVD5.

Applies a predictive model to a given text modeling context



Add a Decision Tree node (Model → Decision Tree) and connect it to the Text Cluster node.

“Be sure to include a screenshot or download the visual representation of your Tree from the results of the Decision Tree diagram, and explain what it means in terms of the content/text of the movies (and their descriptions) and the original business problem of attempting to predict box office earnings/success of movies using text mining” (courserooma.capella, 2017).
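
To help read the tree, it may be useful to see how a single split on one SVD variable could be chosen. The Python sketch below is only an illustration under stated assumptions: it uses Gini impurity as the splitting criterion (the Decision Tree node in SAS Enterprise Miner may use a different one), and the SVD scores and gross categories are hypothetical toy values, not results from the assignment data.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best binary split on one variable."""
    best = (None, gini(labels))  # no split: impurity of the whole node
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical toy data: one SVD score per movie and a made-up gross category.
svd1 = [0.10, 0.15, 0.20, 0.60, 0.70, 0.80]
category = ["low", "low", "low", "high", "high", "high"]

threshold, impurity = best_split(svd1, category)
print(threshold, impurity)
```

Each internal node of a tree can be read the same way: a threshold on one SVD dimension that separates movies into groups whose gross categories are more homogeneous than before the split.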



References

Courserooma.capella. (2017). PROBABILITY MODELING. Retrieved August 10, 2017, from https://courserooma.capella.edu/webapps/blackboard/content/listContent.jsp?course_id=_49663_1&content_id=_5183598_1&mode=reset

Viewer.Books. (2017). Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications Tutorial O - Predicting Box Office Success of Motion Pictures with Text Mining. Retrieved August 25, 2017, from http://viewer.books24x7.com/assetviewer.aspx?bookid=49265&chunkid=981083261&resumebookmarkid=8bd103e7-f488-e711-a9c3-00505686029c#
