Assignment requirements for the MMIS692 Business Intelligence capstone project. Target: answers to questions 7, 8, 9, and 10 in the details below. The dataset and additional documents to be used will be provided to the successful bidder.
Requirements for Business Intelligence Capstone Project.
A company produces 5 types of alarm systems – P1, P2, P3, P4, and P5 – and supplies them to a retailer. It is contractually obligated to meet the demands of the retailer for each alarm system. Because of limited capacity the company may not have sufficient machining, assembly, and finishing time available to satisfy the entire demand in each period through its regular production runs. Contractual obligation requires the company to make up the shortfall in production through special production runs at higher costs. The company aims to meet the retailer’s demands at minimum cost.
LP Formulation:
Task 1: (10 Points) Formulate a linear programming (LP) model that may be solved to identify the optimal production plan for the company.
Specifically, you must define the decision variables, objective function, and constraints in your LP model using the following parameters:
In each time period, for each product, the parameters are:
Demand (number of units required) for the product.
Cost (in dollars) of producing each unit of the product in a regular run.
Cost (in dollars) of producing each unit of the product in a special run.
Machining time (in minutes) required to produce each unit of the product.
Assembly time (in minutes) required to produce each unit of the product.
Finishing time (in minutes) required to produce each unit of the product.
Further, assume that the following capacities are given:
Hours of machining time available for the regular run.
Hours of assembly time available for the regular run.
Hours of finishing time available for the regular run.
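One possible way to write the model is sketched below. All symbols are assumed notation introduced here for illustration, not the assignment's own: x_i and y_i are the units of product i made in the regular and special runs, d_i is the demand, c_i and s_i are the regular and special unit costs, m_i, a_i, and f_i are the machining, assembly, and finishing minutes per unit, and M, A, and F are the hours available in the regular run (multiplied by 60 to match the per-unit times in minutes).

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{i=1}^{5} \left( c_i x_i + s_i y_i \right) \\
\text{s.t.}\quad & x_i + y_i = d_i, && i = 1,\dots,5 \\
& \sum_{i=1}^{5} m_i x_i \le 60M \\
& \sum_{i=1}^{5} a_i x_i \le 60A \\
& \sum_{i=1}^{5} f_i x_i \le 60F \\
& x_i \ge 0,\; y_i \ge 0, && i = 1,\dots,5
\end{aligned}
```

The capacity constraints involve only the regular-run variables, since the brief limits machining, assembly, and finishing time for regular runs only.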
LP Parameter Estimation:
You must now use available data to estimate the parameters of the LP formulated in Task 1.
Estimation of machining time, assembly time, finishing time, and regular cost:
The file “production.csv” contains 7 columns: serialnbr, productcode, batchnbr, machinetime, assemblytime, finishingtime, cost.
The columns in the file may be interpreted as follows: serialnbr is a unique identifier assigned to each unit produced by the company; productcode specifies the product type; batchnbr identifies the batch in which an item is produced (items are produced in batches); machinetime, assemblytime, and finishingtime specify the time (in minutes) taken by each process to produce a unit; the last attribute, cost, specifies the cost (in dollars) of producing the unit in a regular run.
Task 2: (10 Points) Using the data from the file “production.csv”, compute the average machining time, assembly time, finishing time, and cost per unit for each product type as estimates of the corresponding parameters of the LP model.
Briefly explain how you estimated the parameters.
Specify your parameter estimates in the table below, rounded to the nearest integer.
Estimates | P1 | P2 | P3 | P4 | P5
Machine Time (minutes) | | | | |
Assembly Time (minutes) | | | | |
Finish Time (minutes) | | | | |
Regular Cost (dollars) | | | | |
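The averaging in Task 2 could be sketched as follows, assuming pandas is available. The sample rows are hypothetical stand-ins for production.csv, and the helper name average_parameters is illustrative only.

```python
import io
import pandas as pd

# Hypothetical sample rows; the real data would come from production.csv.
SAMPLE = """serialnbr,productcode,batchnbr,machinetime,assemblytime,finishingtime,cost
1,P1,101,10,20,30,100
2,P1,101,12,22,34,110
3,P2,102,8,15,25,80
4,P2,102,10,17,27,90
"""

def average_parameters(csv_source):
    """Return per-product mean times and cost, rounded to the nearest integer."""
    df = pd.read_csv(csv_source)
    cols = ["machinetime", "assemblytime", "finishingtime", "cost"]
    return df.groupby("productcode")[cols].mean().round().astype(int)

estimates = average_parameters(io.StringIO(SAMPLE))
print(estimates)
```

With the real file, passing the path "production.csv" in place of the in-memory sample would give one row of estimates per product type.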
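Estimation of special run cost:
It is known that the regular production cost is a linear function of the machining, assembly, and finishing times for each product type. That is, for each product, regular cost per unit = fixed cost + (machining cost per minute × machining time) + (assembly cost per minute × assembly time) + (finishing cost per minute × finishing time), where the fixed cost is incurred to produce each unit of the product, and the three per-minute costs are, respectively, the costs of machining, assembly, and finishing each unit of the product during a regular run.
Task 3: (10 Points) Run regressions to estimate the fixed cost and the per-minute machining, assembly, and finishing costs for each product.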
In your report, please explain how you obtained the data for the 5 regressions to estimate the coefficients. Then present your coefficient estimates in the table below. Round all estimates to 1 decimal place.
Coefficient estimates | P1 | P2 | P3 | P4 | P5
Intercept (fixed cost) | | | | |
MACHINE TIME (cost per minute) | | | | |
ASSEMBLY TIME (cost per minute) | | | | |
FINISH TIME (cost per minute) | | | | |
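A minimal sketch of one such regression, assuming NumPy and ordinary least squares. In practice the rows of production.csv would first be filtered by productcode to give one data set per product; the data below are synthetic and noise-free, generated from hypothetical coefficients so the fit can be checked by eye.

```python
import numpy as np

def fit_cost_model(machine, assembly, finish, cost):
    """Least-squares fit of cost = b0 + b1*machine + b2*assembly + b3*finish."""
    X = np.column_stack([np.ones(len(cost)), machine, assembly, finish])
    coef, *_ = np.linalg.lstsq(X, np.asarray(cost, dtype=float), rcond=None)
    return np.round(coef, 1)  # [intercept, machine, assembly, finish]

# Hypothetical data for a single product, built from known coefficients.
rng = np.random.default_rng(0)
machine = rng.uniform(5, 15, 50)
assembly = rng.uniform(10, 30, 50)
finish = rng.uniform(20, 40, 50)
cost = 12.0 + 1.5 * machine + 0.8 * assembly + 0.5 * finish

coef = fit_cost_model(machine, assembly, finish, cost)
print(coef)
```

Repeating the call once per product code would fill the five columns of the coefficient table.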
The fixed cost associated with the production of each unit of a product is the same under the regular and the special run, but the special run costs per minute for machining, assembly, and finishing are 2 times the regular run costs.
Task 4: (5 Points) Use the parameters estimated in Task 3 and the average machining time, assembly time, and finishing time (estimated in Task 2) to compute the cost of producing each unit of each product in a special run as fixed cost + 2 × [(machining cost per minute × average machining time) + (assembly cost per minute × average assembly time) + (finishing cost per minute × average finishing time)].
Present the estimates in the following format, rounding costs to the nearest dollar:
Product type | P1 | P2 | P3 | P4 | P5
Special production cost per unit (dollars) | | | | |
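The special run cost calculation is plain arithmetic and might look like the sketch below. The function name and the numbers passed to it are hypothetical; only the doubling of the per-minute costs comes from the brief.

```python
def special_run_cost(intercept, rate_machine, rate_assembly, rate_finish,
                     avg_machine, avg_assembly, avg_finish, multiplier=2):
    """Unit cost in a special run: same fixed cost, per-minute rates scaled."""
    variable = (rate_machine * avg_machine
                + rate_assembly * avg_assembly
                + rate_finish * avg_finish)
    # Round to the nearest dollar, as the report format requests.
    return round(intercept + multiplier * variable)

# Hypothetical coefficients (Task 3) and average times (Task 2) for one product.
example_cost = special_run_cost(12.0, 1.5, 0.8, 0.5, 10, 20, 30)
print(example_cost)
```

Here the variable part is 15 + 16 + 15 = 46 dollars per unit in a regular run, so the special run cost is 12 + 2 × 46 = 104 dollars.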
Estimation of demand
The text file “demand.csv” contains the retailer’s sales data by region (‘North’, ‘South’, ‘East’, ‘West’) for the 5 products over the last 52 periods. It has 1040 records and 4 columns: period, productcode, region, and sales.
Each row may be interpreted as follows: sales is the number of units of the product with the specified productcode sold in the given region in that period. The total sales of a product across all 4 regions in a period is taken to be the demand for that product in that period.
Task 5: (5 Points) For each product, obtain the demand for the product in each of the 52 periods.
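Summing sales over the four regions could be sketched as follows, assuming pandas. The rows are hypothetical and cover only two periods and two products; demand_by_period is an illustrative name.

```python
import io
import pandas as pd

# Hypothetical excerpt in the layout of demand.csv.
SAMPLE = """period,productcode,region,sales
1,P1,North,10
1,P1,South,20
1,P1,East,30
1,P1,West,40
1,P2,North,5
1,P2,South,5
1,P2,East,5
1,P2,West,5
2,P1,North,11
2,P1,South,21
2,P1,East,31
2,P1,West,41
2,P2,North,6
2,P2,South,6
2,P2,East,6
2,P2,West,6
"""

def demand_by_period(csv_source):
    """Return a table with one row per period and one column per product."""
    df = pd.read_csv(csv_source)
    return df.pivot_table(index="period", columns="productcode",
                          values="sales", aggfunc="sum")

demand = demand_by_period(io.StringIO(SAMPLE))
print(demand)
```

On the full file the result would have 52 rows and 5 columns, which is a convenient shape for the forecasting step.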
Task 6: (10 Points) Use the data obtained in Task 5 to predict the demand in time period 53 for each product.
You should consider various prediction and forecasting methods that you are familiar with. Use the method that you think is best suited for estimating demands. In your report, please present the estimates for time period 53 in the following format:
Product type | P1 | P2 | P3 | P4 | P5
Demand in period 53 | | | | |
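The brief leaves the choice of forecasting method open. As one hedged illustration, the sketch below compares a moving average with a linear trend on a holdout of recent periods, assuming NumPy. The two methods, the window length, and the demand history are all assumptions made for the example, not a recommendation.

```python
import numpy as np

def moving_average_forecast(series, window=4):
    """Forecast the next period as the mean of the last `window` observations."""
    return float(np.mean(series[-window:]))

def linear_trend_forecast(series):
    """Fit demand = a + b * period by least squares and extrapolate one period."""
    periods = np.arange(1, len(series) + 1)
    slope, intercept = np.polyfit(periods, series, 1)
    return float(intercept + slope * (len(series) + 1))

def holdout_mae(series, forecaster, n_test=8):
    """Mean absolute one-step-ahead error over the last `n_test` periods."""
    errors = [abs(series[t] - forecaster(series[:t]))
              for t in range(len(series) - n_test, len(series))]
    return float(np.mean(errors))

# Hypothetical 52-period history with a steady upward trend: 100, 102, ..., 202.
history = np.array([100 + 2 * t for t in range(52)], dtype=float)

for name, method in [("moving average", moving_average_forecast),
                     ("linear trend", linear_trend_forecast)]:
    print(name, round(method(history), 1), round(holdout_mae(history, method), 2))
```

Whichever method has the lower holdout error for a product would be the natural candidate for its period 53 estimate.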
Optimal LP Solution:
I have also posted a document with suggestions on how you should proceed with your final report and detailed guidance on the tasks. You may wish to use this opportunity to accomplish Task 10 using Python or R libraries.
Task 7: (10 Points) Solve the LP formulated in Task 1 using the parameters estimated in Tasks 2, 4, and 6 to determine the optimal production plan for period 53.
Report the minimum production cost achievable, number of units of each product type to be produced under the regular and special production runs, and the resources used during regular run in the following format:
Minimum cost attainable:
Number of units produced | P1 | P2 | P3 | P4 | P5
Regular Run | | | | |
Special Run | | | | |
Resources in regular run | Minutes used
MACHINE TIME |
ASSEMBLY TIME |
FINISH TIME |
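A minimal sketch of solving such a model with scipy.optimize.linprog is shown below. It assumes a formulation with one regular-run and one special-run variable per product, demand met exactly, and capacity limits on the regular run only. Every number is a hypothetical placeholder, not an estimate from the data, and the function name is illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def solve_production_plan(demand, reg_cost, spec_cost,
                          machine, assembly, finish, cap_minutes):
    """Minimise regular + special cost; variables are [x_1..x_n, y_1..y_n]."""
    n = len(demand)
    c = np.concatenate([reg_cost, spec_cost])
    # Demand: regular units + special units = demand, for each product.
    A_eq = np.hstack([np.eye(n), np.eye(n)])
    # Capacity (in minutes) applies to regular-run units only.
    A_ub = np.hstack([np.vstack([machine, assembly, finish]), np.zeros((3, n))])
    res = linprog(c, A_ub=A_ub, b_ub=cap_minutes, A_eq=A_eq, b_eq=demand,
                  bounds=(0, None), method="highs")
    regular, special = res.x[:n], res.x[n:]
    minutes_used = A_ub[:, :n] @ regular
    return res.fun, regular, special, minutes_used

# Hypothetical parameters: 5 products, 1 minute per process per unit.
demand = np.array([10.0, 10.0, 10.0, 10.0, 10.0])
reg_cost = np.array([5.0, 4.0, 6.0, 7.0, 3.0])
spec_cost = np.array([8.0, 9.0, 8.0, 12.0, 4.0])
ones = np.ones(5)
cap_minutes = np.array([25.0, 100.0, 100.0])  # hours available * 60

cost, regular, special, minutes_used = solve_production_plan(
    demand, reg_cost, spec_cost, ones, ones, ones, cap_minutes)
print(cost, regular, special, minutes_used)
```

In this toy instance only machining is binding, so the regular run goes to the products that save the most per machining minute. An LP can return fractional unit counts; whether to round or switch to an integer model is a judgment the brief does not settle.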
Sensitivity Analysis:
Task 8. (3+12 = 15 Points). Perform sensitivity analysis by changing one parameter at a time (leaving all other parameters fixed at the values used in Task 7) and answer the following questions.
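The one-parameter-at-a-time procedure can be sketched as a loop that copies the base-case parameters, changes a single value, and re-solves. The example below varies only the machining capacity. The two-product base case, the dictionary layout, and the function names are hypothetical, and the solver setup assumes the same kind of formulation as a regular/special split with capacity on the regular run.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical base-case parameters (two products, machining is binding).
BASE = {
    "demand": np.array([10.0, 10.0]),
    "reg_cost": np.array([5.0, 4.0]),
    "spec_cost": np.array([8.0, 9.0]),
    "times": np.array([[1.0, 1.0],    # machining minutes per unit
                       [1.0, 1.0],    # assembly minutes per unit
                       [1.0, 1.0]]),  # finishing minutes per unit
    "capacity": np.array([15.0, 100.0, 100.0]),  # minutes available
}

def min_cost(p):
    """Solve the LP for parameter set `p` and return the minimum cost."""
    n = len(p["demand"])
    c = np.concatenate([p["reg_cost"], p["spec_cost"]])
    A_eq = np.hstack([np.eye(n), np.eye(n)])
    A_ub = np.hstack([p["times"], np.zeros((3, n))])
    res = linprog(c, A_ub=A_ub, b_ub=p["capacity"], A_eq=A_eq,
                  b_eq=p["demand"], bounds=(0, None), method="highs")
    return res.fun

def vary_capacity(base, resource_index, values):
    """Re-solve with one capacity changed at a time; all else stays at base."""
    costs = {}
    for v in values:
        p = {k: a.copy() for k, a in base.items()}  # leave the base case intact
        p["capacity"][resource_index] = v
        costs[v] = min_cost(p)
    return costs

machining_sweep = vary_capacity(BASE, 0, [10.0, 15.0, 20.0])
print(machining_sweep)
```

The same pattern applies to a demand, a unit cost, or a per-unit time: change one entry in the copied parameter set and compare the new minimum cost with the base case.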
Quality Control
The text file “defective.csv” contains 2 columns. The first column, defectiveid, is an identifier, and the second column, serialnbr, specifies the serial number of a defective product.
The text file “quality.csv” contains 11 columns containing data from quality control tests run on batches of items produced. The column batchnbr specifies a batch number and test1, test2, test3, test4, test5, test6, test7, test8, test9, and test10 are the results for 10 tests for that batch.
Recall that batchnbr in the PRODUCTION file specifies the batch in which a product with given serialnbr is produced. Any batch that contains more than one defective item is deemed to be of poor quality; a batch with at most one defective item is considered to be of good quality.
Task 9: (10 Points) Formulate an SQL query that lists all columns from the QUALITY table and adds a derived column batchquality that contains “poor” if the batch is of poor quality (contains at least 2 defective items) and “good” otherwise.
In your report, include:
1. The SQL query for Task 9
2. The results of the query in a file qualityInput.csv.
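One way such a query might be written is sketched below and run against an in-memory SQLite database from Python. The table names production, defective, and quality are assumed to mirror the three files, the tables are trimmed to a few columns for brevity (q.* picks up every column the real table has), and the rows are hypothetical.

```python
import sqlite3

QUERY = """
SELECT q.*,
       CASE WHEN COALESCE(d.n_defective, 0) > 1 THEN 'poor' ELSE 'good' END
           AS batchquality
FROM quality AS q
LEFT JOIN (
    SELECT p.batchnbr, COUNT(DISTINCT df.serialnbr) AS n_defective
    FROM defective AS df
    JOIN production AS p ON p.serialnbr = df.serialnbr
    GROUP BY p.batchnbr
) AS d ON d.batchnbr = q.batchnbr
ORDER BY q.batchnbr;
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE production (serialnbr INTEGER, batchnbr INTEGER);
CREATE TABLE defective (defectiveid INTEGER, serialnbr INTEGER);
CREATE TABLE quality (batchnbr INTEGER, test1 REAL, test2 REAL);
INSERT INTO production VALUES (1, 101), (2, 101), (3, 102), (4, 102), (5, 103);
INSERT INTO defective VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO quality VALUES (101, 0.5, 0.1), (102, 0.4, 0.2), (103, 0.3, 0.3);
""")

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

In this toy data batch 101 holds two defective items and is labelled poor, while batches 102 (one defective item) and 103 (none) are labelled good. The LEFT JOIN keeps batches with no defective items, which an inner join would drop. The fetched rows could then be written out with the csv module to produce qualityInput.csv.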
Task 10: (10 Points) Partition the data obtained from Task 9 to train and test a Classification Tree that predicts batchquality based on values of the features test1, test2, test3, test4, test5, test6, test7, test8, test9, and test10. Use 80% of the observations for training and validation purposes; the remaining 20% should be used for testing.
In your report:
1. Specify the number of training and test examples that you used.
2. Specify the rules that you obtained in Task 10 in the canonical form:
IF … THEN …
3. Present the classification accuracy of this set of rules for the training set and the test set. Also present the confusion matrices in the form:
Training set:
Accuracy = _____%.
Confusion matrix:
Number of batches | Actual Good Quality | Actual Poor Quality
Predicted Good Quality | |
Predicted Poor Quality | |
Test set:
Accuracy = _____%.
Confusion matrix:
Number of batches | Actual Good Quality | Actual Poor Quality
Predicted Good Quality | |
Predicted Poor Quality | |
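A minimal sketch of the Task 10 workflow with scikit-learn follows. The data are synthetic stand-ins for qualityInput.csv, the labelling rule is invented purely so the tree has something to learn, and the depth limit and random seeds are arbitrary choices.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

FEATURES = [f"test{i}" for i in range(1, 11)]

# Synthetic stand-in for qualityInput.csv: 200 batches, 10 test results each.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 10))
# Hypothetical labelling rule, not taken from the real data.
y = np.where(X[:, 0] + X[:, 3] > 1.2, "poor", "good")

# 80% for training and validation, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

# Each root-to-leaf path in this printout can be rewritten as one IF ... THEN rule.
rules = export_text(tree, feature_names=FEATURES)
print(rules)

labels = ["good", "poor"]
for name, Xs, ys in [("Training", X_train, y_train), ("Test", X_test, y_test)]:
    pred = tree.predict(Xs)
    print(f"{name} set: accuracy = {100 * accuracy_score(ys, pred):.1f}%")
    # scikit-learn puts actual classes in rows and predicted classes in columns;
    # transpose to match the report layout (predicted rows, actual columns).
    print(confusion_matrix(ys, pred, labels=labels).T)
```

Validation within the 80% portion, for example cross-validation over max_depth, is left out of the sketch but would fit between the split and the final fit.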
If you wish, you may also use other prediction and classification methods (such as Logistic Regression, Neural Nets, Support Vector Machines, and Discriminant Analysis) to classify batchquality based on values of the features test1, test2, test3, test4, test5, test6, test7, test8, test9, and test10. Comment on the classification accuracy of these methods for the training set and the test set.