Sunday, September 18, 2005
Updating Dates Table in MySql using Microsoft VB .Net
Try
Dim dateConvert As New mySQLlDate
conn = New MySqlConnection
conn.ConnectionString = "server=" & dbServer & "; user id=" & dbUser & "; password=" & dbPassword & "; database=" & dbName
conn.Open()
Dim updateCommand As New MySqlCommand("UPDATE GIDNET_AVAILDATE " _
& "SET UTIL_DATE_FROM =@UTIL_DATE_FROM ," _
& "UTIL_DATE_TO =@UTIL_DATE_TO," _
& "UTIL_LAST_TICKET_DATE = @UTIL_LAST_TICKET_DATE WHERE AD_ID = @AD_ID", conn)
With updateCommand.Parameters
.Clear()
.Add("@UTIL_DATE_FROM", dateConvert.convertDate(strDateFrom))
.Add("@UTIL_DATE_TO", dateConvert.convertDate(strDateTo))
.Add("@UTIL_LAST_TICKET_DATE", dateConvert.convertDate(strLastTck))
.Add("@AD_ID", DateID)
End With
updateCommand.ExecuteNonQuery()
conn.Close()
Catch myerror As MySqlException
'TestResult.Text = "Error Connecting to Database: " & myerror.Message
Finally
conn.Dispose()
End Try
16:45 Posted in Database | Permalink | Comments (0) | Email this
Sunday, August 21, 2005
About SQL Server Services
1- MSSQLServer Service
It is the Database engine itself. It helps in managing, verifying data and data integrity, and processing the transactions.
2- SQL Server Agent
1. Jobs: Group of Transact-SQL statements that has to be executed in a specific recurrence or at specific time.
2. Alerts: Actions must be taken when a specific event occurs such as emailing the administrator when the server hard-drive runs out of space.
3. Operators: A defined user who has the ability to manage the server; usually he will be the target of the alert.
3- MS DTC (Distributed Transaction Coordinator)
Service used by the server that enables transactions to include more than one server, e.g. running a Transact-SQL statement against tow servers.
4- Microsoft Search Service
Enables conducting a full text search against all the available records.
09:10 Posted in Database | Permalink | Comments (0) | Email this
Monday, August 15, 2005
Two Ways to Transfer Data
Source (http://www.kdkeys.net/)
I believe most of us are familiar with one way of transferring the rows in one table to the rows in another table provided the target table exists. We do this by using the INSERT INTO … SELECT FROM statement. This is a way to quickly use SQL for mass inserting and it allows us to apply filters through the where clause in the select statement.
But what if the target table does not exist and you do not have time or want to take the time to build it? In this case, you would want to use the SELECT INTO statement for transferring the rows in a source table to a target table and build the target table at the same time.
For example, we have an art works database and we want to take all the rows currently in the Works table and copy them to the Works_Temp table. The Works_Temp table must already exist and the SQL would look like this:
USE Art
GO
Select * INTO Works_Temp from Works;
However Works_Temp is just that: temporary. We will probably drop it when we are through. We have to do some maintenance on the source table and do not have time right now to build the target. Here is the statement we would use if the target table does not exist:
INSERT INTO Works_Temp
SELECT * FROM Works;
It saves a lot of time.
13:30 Posted in Database | Permalink | Comments (0) | Email this
Wednesday, August 03, 2005
Choose the Right Recovery Model for Your Database
I had a question the about recovery models for SQL databases. This is the setting that you are shown when you do a right-click on a database in Enterprise Manager and view the properties. It is under the Options tab in the database properties dialogue window. You have 3 choices for a recovery model on your database:
1. Simple
2. Full
3. Bulk-logged.
The way I interpret the “Simple” model is that it would be perfect for databases that rarely change. In other words, these are “static” databases. Why? It is because with this model, transaction logs do not get backed up and therefore do not come into the picture upon recovery. You basically get a snapshot of your database at the time it was backed up without any logs to apply. For example, a database with zip codes would not be one that changes too often and a simple recovery model is probably all that is required.
“Full” does just what it says: backs up everything – logs and data. It is important to use this when you have a database that is heavy in transactions and you need to recover to a certain point in time.
“Bulk-logged” took me some time to understand but let me sum it up like this: Bulk-logged means that certain types of bulk load operations are not logged to the detailed extent as other transactions. Only references are created to the actual data records in the raw data files. Therefore, you have point-in-time recovery for the most part but if you lose one of those datafiles where the references point to then you may have to do your bulk operations again as part of the recoveyr. This might not be a good idea if those manual data loads take hours to complete.
08:45 Posted in Database | Permalink | Comments (0) | Email this
Tuesday, August 02, 2005
Physical Operator: Merge Joins
Another physical operator that is used by SQL Optimizer for computing joins. In the ensted loop join as I told you all previously it will take the cost of performing join will be m X n where m is number of rows in outer and n is the rows in inner table. SQL Optimizer uses Merge join in the following case
1. If clustered index is there on both the columns on which join is being performed.
2. There is one-many relationship between both the table.
In that case optimizer uses Merge join for computing join. It result in the time complexity og m + n.
Let see how
As both the table have clustered index on the column on which join operation is performed, the tables will be sorted on that column. So merge join will begin by comparing each row of one table with another. As both are sorted it will do the following
1. If both values are equal i.e outer and inner then its added to the result set
2. If outer table is smaller than the inner value, then comparision stops and next value of outer table is taken.
3. If outer table is greater than the inner value, then again the comparision stops and next value from outer table is taken
09:35 Posted in Database | Permalink | Comments (0) | Email this
Saturday, July 30, 2005
Physical Operator: Nested Loop
When we have join in queries, the SQL optimizer uses one of the following three physical operator to perform the join
1. Nested Loop
2. Merge
3. Hash
Nested Loop
In nested loop, while computing the join each row of one table is match with all the rows of the other table. The tables are termed as inner and outer table. The table that is on the outer side is the outer table and table which is matched for each row is called the inner table. So if we have mrows in outer table and n rows in the inner table the number of comparision are m X n.
For e.g.
Suppose there are two tables EMP and DEPT storing employee and department in which employee belongs respectively. Suppose Emp has 10 records and Dept has 5 records. Let do a inner join
SELECT EMP.EMPNAME, DEPT.DNAME,EMP.DATEOFBIRTH FROM EMP INNER JOIN DEPT ON EMP.DPETID = DEPT.DEPTID
In the above query if optimizer decides Nested Loop, then each record of employee the DpetID is match with all the rows of Dept table and if thee is a match it is part of the final result set.
19:05 Posted in Database | Permalink | Comments (0) | Email this
Friday, July 29, 2005
Physical Operator: Index Scan and Seek
Index Scan
Index Scan means performing a full scan on Index Pages. This is especially done for table that have non clustered index on them.
In case of Non Clustered Index the leaf pages contain links to the rows in the data pages whereas in the Clustered index the leaf page contain the row itself. So when we execute query and Optimizer decides to use Index Scan then it traverses though all the leaf pages and fetching the rows from the data pages to the resultset.
Index Seek
It is similar to the Index Scan, the only difference being instead of traversing all the rows of the leaf pages it uses the SEEK functionality of Query Optimizer to fetch record from the Non clustered index. So it is more better than the Index Scan as it does not do traversal of all the rows in the Non clustered index leaf pages.
09:25 Posted in Database | Permalink | Comments (0) | Email this
Tuesday, July 26, 2005
Re: Need EM algorithm in C/C++
Just for grins, here is an old EM code I had laying around that finds a maximum likelihood solution for a Gaussian mixture in R^n. It is missing a macro library that supports dynamic vector allocation. It is similar to vector in C++. There are two C files here, emclust.c and vector.c and one header file, vector.h. If this helps you, good. If not, tough.
------------------------ vector.h -----------------------
float dot(float *a, float *b, int dim);
float norm(float *a, int dim);
void normalize(float *a, int dim);
void axpy(float *r, float a, float *x, float *y, int n);
float *make_vector(int n);
void random_vector(float *a, int dim);
------------------------ vector.c -----------------------
#include
#include
#include
void random_vector(float *a, int dim)
{
int i;
for (i=0;i
a[ i ] = 2*drand48() - 1;
}
}
float dot(float *x, float *y, int dim)
{
float r;
int i, even;
r = 0;
even = dim >> 2;
for (i=0;even;even--) {
r += x[ i ]*y[ i ] + x[i+1]*y[i+1] + x[i+2]*y[i+2] + x[i+3]*y[i+3];
i += 4;
}
for (;i
r += x[ i ]*y[ i ];
}
return r;
}
double norm(float *a, int dim)
{
return sqrt(dot(a, a, dim));
}
void normalize(float *a, int dim)
{
int i;
double mag;
mag = norm(a, dim);
for (i=0;i
a[ i ] /= mag;
}
}
void axpy(float *r, float a, float *x, float *y, int n)
{
int i;
if (a == 0 || !x) {
if (y) {
memcpy(r, y, n * sizeof(float));
}
else {
for (i=0;i
r[ i ] = 0;
}
}
}
else if (x) {
int even;
even = n >> 2;
if (y) {
for (i=0;even;even--) {
r[i ] = a*x[i ] + y[i ];
r[i+1] = a*x[i+1] + y[i+1];
r[i+2] = a*x[i+2] + y[i+2];
r[i+3] = a*x[i+3] + y[i+3];
i += 4;
}
for (;i
r[ i ] = a*x[ i ] + y[ i ];
}
}
else {
for (i=0;even;even--) {
r[i ] = a*x[i ];
r[i+1] = a*x[i+1];
r[i+2] = a*x[i+2];
r[i+3] = a*x[i+3];
i += 4;
}
for (;i
r[ i ] = a*x[ i ];
}
}
}
}
float *make_vector(int n)
{
return calloc(n, sizeof(float));
}
------------------------ emclust.c -----------------------
#include
#include
#include
#define DVECTOR_BASIC
#include
#include "vector.h"
/* use estimate maximization optimization to cluster data in R^n using
the euclidean metric
this is just an attempt to estimate the parameters of a mixed gaussian
distribution
the recurrences used are
mean1 = sum w1[ i ] * (x[ i ]) / sum w1[ i ]
var1 = sqrt (sum w1[ i ] * (x[ i ]-mean1)^2 / sum w1[ i ])
mean2 = sum w2[ i ] * (x[ i ]) / sum w2[ i ]
var2 = sqrt (sum w2[ i ] * (x[ i ]-mean2)^2 / sum w2[ i ])
l* N(mean1, var1, x[ i ])
where w1[ i ] = -----------------------------------------------------
(l*N(mean1, var1, x[ i ]) + (1-l)*N(mean2, var2, x[ i ]))
(1-l)* N(mean2, var2, x[ i ])
and w2[ i ] = -----------------------------------------------------
(l*N(mean1, var1, x[ i ]) + (1-l)*N(mean2, var2, x[ i ]))
l is found by solving
(N(mean1, var1, x[ i ]) - N(mean2, var2, x[ i ]))
sum ----------------------------------------------------- = 0
i (l*N(mean1, var1, x[ i ]) + (1-l)*N(mean2, var2, x[ i ]))
*/
/*
* gaussian normalization constant = 1/sqrt(2 * PI)
*/
#define NORM 0.39894228040143267794
float sq(float *x, float *mean, int dim)
{
int i;
register double r, t;
r = 0;
for (i=0;i
t = x[i+0]-mean[i+0];
r += t*t;
t = x[i+1]-mean[i+1];
r += t*t;
t = x[i+2]-mean[i+2];
r += t*t;
t = x[i+3]-mean[i+3];
r += t*t;
}
for (;i
t = x[i ]-mean[i ];
r += t*t;
}
return r;
}
float normal(float *mean, float sd, float *x, int dim)
{
return pow(NORM, (double) dim) * exp( - sq(x, mean, dim)/(sd*sd*2) )/sd;
}
double *n1, *n2;
int n;
/* auxilliary function for solving for the mixing constant */
double f(double l)
{
int i;
double sum;
sum = 0;
for (i=0;i
sum += (n1[ i ]-n2[ i ])/(l*n1[ i ] + (1-l)*n2[ i ]);
}
return sum;
}
double solve(double (*f)(), double low, double high)
{
double mid, flow, fmid, fhigh;
mid = (low+high)/2;
flow = f(low);
fmid = f(mid);
fhigh = f(high);
while (high-low > 0.0001) {
if (flow > 0) {
if (fmid > 0) {
low = mid;
flow = fmid;
}
else {
high = mid;
fhigh = fmid;
}
}
else {
if (fmid <= 0) {
low = mid;
flow = fmid;
}
else {
high = mid;
fhigh = fmid;
}
}
mid = (low+high)/2;
fmid = f(mid);
}
return mid;
}
float *get_row(int dim)
{
int i;
float *r;
r = calloc(dim, sizeof(r[0]));
for (i=0;i
if (scanf("%f", &r[ i ]) != 1) {
free(r);
return NULL;
}
}
return r;
}
int main(int argc, char *argv[])
{
int ch;
extern int getopt();
extern char *optarg;
int error = 0;
int dim;
int seed;
int cluster_count = 2;
int i, j;
struct {
int count, size;
float **contents;
} x;
float *t;
float *mean1, sd1, *mean2, sd2, l;
float *m1, s1, *m2, s2;
error = 0;
dim = 1;
while ((ch = getopt(argc, argv, "d:")) != EOF) {
switch (ch) {
case 'd':
if (sscanf(optarg, "%d", &dim) != 1) {
error = 1;
}
break;
case 's':
if (sscanf(optarg, "%d", &seed) != 1) {
error = 1;
}
break;
case 'n':
if (sscanf(optarg, "%d", &cluster_count) != 1) {
error = 1;
}
break;
case '?':
error = 1;
break;
}
}
srand48(seed);
DVectorInit(&x);
t = get_row(dim);
while (t) {
DVectorPush(&x, t);
t = get_row(dim);
}
n = DVectorLength(&x);
n1 = calloc(n, sizeof(n1[0]));
n2 = calloc(n, sizeof(n2[0]));
l = 0.5;
mean1 = calloc(dim, sizeof(mean1[0]));
m1 = calloc(dim, sizeof(mean1[0]));
sd1 = 1;
mean2 = calloc(dim, sizeof(mean2[0]));
m2 = calloc(dim, sizeof(mean2[0]));
mean2[0] = 1;
sd2 = 1;
for (j=0;j<100;j++) {
double sum1, sum2, w;
for (i=0;i
n1[ i ] = normal(mean1, sd1, DVectorNth(&x, i), dim);
n2[ i ] = normal(mean2, sd2, DVectorNth(&x, i), dim);
}
axpy(m1, 0.0, NULL, NULL, dim);
sum1 = 0.0;
for (i=0;i
w = l*n1[ i ]/(n1[ i ]+n2[ i ]);
axpy(m1, w, DVectorNth(&x, i), m1, dim);
sum1 += w;
}
axpy(m1, 1/sum1, m1, NULL, dim);
sum1 = 0.0;
sum2 = 0.0;
for (i=0;i
w = l*n1[ i ]/(n1[ i ]+n2[ i ]);
sum1 += w*sq(DVectorNth(&x, i), m1, dim);
sum2 += w;
}
s1 = sqrt(sum1/sum2);
axpy(m2, 0.0, NULL, NULL, dim);
sum1 = 0;
for (i=0;i
w = (1-l)*n2[ i ]/(n1[ i ]+n2[ i ]);
axpy(m2, w, DVectorNth(&x, i), m1, dim);
sum1 += w;
}
axpy(m2, 1/sum1, m2, NULL, dim);
sum1 = 0;
sum2 = 0;
for (i=0;i
w = (1-l)*n2[ i ]/(n1[ i ]+n2[ i ]);
sum1 += w*sq(DVectorNth(&x, i), m2, dim);
sum2 += w;
}
s2 = sqrt(sum1/sum2);
l = solve(f, 0.0, 1.0);
axpy(mean1, 1.0, m1, NULL, dim);
sd1 = s1;
axpy(mean2, 1.0, m2, NULL, dim);
sd2 = s2;
for (i=0;i
printf("%10.3f ", mean1[ i ]);
}
printf("(%10.3f)n", sd1);
for (i=0;i
printf("%10.3f ", mean2[ i ]);
}
printf("(%10.3f)nn", sd2);
}
return 0;
}
23:25 Posted in Database | Permalink | Comments (0) | Email this
Monday, July 25, 2005
Renaming a database column
A column in one of your tables is inconsistently or erroneously named and needs to be changed in order to enforce a naming convention.
Motivation
In large database schemas which need to undergo serious maintenance phases, naming consistency is of outermost importance. This is an issue in other programming environments (OO programming languages for example), but in databases it is more serious due to the lack of programming abstractions at the database level (such as namespaces, for example).
Other reasons for which such a situation might occur can be found in the topic "Database Refactorings - Rename Table"
Mechanics
· Create a new column with the desired name and type. Set the “allow null” property to true if there is any data in your table
· Create a script which moves the data from the old column to the new one
· Deprecate the old column
· Set the allow null property to the appropriate value
· Find all references to the old column and rename them. Also, if the column was involved in any relational integrity constraints, recreate them. Normally, this type of change should not affect your business or interface layer. If this is not the case you need to consider refactorings in those layers too.
14:15 Posted in Database | Permalink | Comments (0) | Email this
Sunday, July 24, 2005
Replacing a database column
One of the constituent attributes of a column definition (except the name attribute) has changed and, as a consequence, the column needs to be replaced. This type of change usually involves changes to the type, precision or “allow null” properties of the column.
Motivation
Although the type of the data is generally less exposed to evolution (as compared with the relationship between different entities, for example), this is still a fairly common situation in large databases.
Some of the attributes of a column definition can be changed “on the fly”, without creating any additional scripts. However, their impact still has to be analyzed, so we will consider the most complex attribute: the type of the column. Other attributes that are likely to undergo evolution are precision, length, “allow null”, collation, etc.
Mechanics
· Create a new column with the desired attributes. Set the “allow null” property to true if there is any data in your table
· Create a script which moves the data from the old column to the new one. Make sure you perform the appropriate conversions.
· Deprecate the old column
· Set the allow null property to the appropriate value
· Find all references to the old column and rename them. Also, if the column was involved in any relational integrity constraints, recreate them. Depending on the attributes that were modified this change could propagate to your business and presentation layer
Example
You are creating an order management system. Due to some changes in the used currencies, you need to change the type of the product price from integer to real.
You are creating an application for mobile devices. The initial device was capable of displaying a maximum of 15 characters in a line. Because of this, you decided to make columns that contained data to be displayed on the mobile devices 15 character long. However, at a moment in time, the client upgrades to a newer version of the device that can display 50 characters on a line. In order to accommodate the new devices, you need to change the dimension of your columns.
21:10 Posted in Database | Permalink | Comments (0) | Email this



