Ian Beckett

RSS feed

    Recent comments

    Authors

    check for index fragmentation in SQL 2005

    Query to check for index fragmentation in SQL 2005 (shoot for below 10% fragmentation):

    -- Replace "YourDatabase" and "YourTable" with your specfic object names

    USE YourDatabase

    SELECT  CAST(DB_NAME(DATABASE_ID) AS VARCHAR(20)) AS 'DatabaseName',
     CAST(OBJECT_NAME([OBJECT_ID]) AS VARCHAR(20)) AS 'TableName',
     INDEX_ID,
     CAST(INDEX_TYPE_DESC AS VARCHAR(20)) AS INDEX_TYPE_DESC,
     AVG_FRAGMENTATION_IN_PERCENT   
    FROM   SYS.DM_DB_INDEX_PHYSICAL_STATS (DB_ID('YourDatabase'),OBJECT_ID('dbo.YourTable'),NULL,NULL,NULL );


    Posted by ibeckett on Friday, May 30, 2008 4:14 PM
    Permalink | Comments (1) | Post RSSRSS comment feed

    Page Scraping Options

    I am working on some fun expiriments in data collection that will require massive automated page scraping... below is a quick list of initial web research resources.

    Page scrape in ASP.NET: http://www.4guysfromrolla.com/webtech/070601-1.shtml 

    "Web Scrape" free page scrape utility: http://www.webscrape.com/

    External links (from http://en.wikipedia.org/wiki/Screen_scraping)

  • PHP & cURL Screen Scraping Tutorials
  • PHP scraping Web site about web scraping using PHP
  • Data extraction for Web 2.0: Screen scraping in Ruby/Rails - Article about web scraping using Ruby
  • Screen-scraping with WWW::Mechanize - Article about web scraping using Perl
  • How to write screen scrapers - Article on writing Javascript-based screen scrapers
  • Creating XML Web Services That Parse the Contents of a Web Page - Microsoft MSDN article
  • Three common methods for data extraction - Article from a blog about Screen Scraping
  • FEAR-less Site Scraping - An article about how to do screen scraping using FEAR::API
  • Web scraping with Java - Article about web scraping using the Java programming language (requires commercial library)
  • Web scraping with PHP and Tcl - Articles about web scraping using PHP and Tcl
  • TTSS. Rapid implimentation of Scanning systems. Since 1991 Inovators in Scanning Airlines and Tour Operators Systems
  • Techreform - web scraping - A commercial provider of web scraping services based in the United Kingdom
  • OutWit Technologies - Publishers of a Web Collection Engine for Firefox
  • Piggy Bank - A joint project by W3C and MIT

  • Posted by ibeckett on Friday, May 30, 2008 2:55 AM
    Permalink | Comments (0) | Post RSSRSS comment feed

    analytic functions introduced with SQL 2005

    New Analytic Functions

    Several new analytic functions provide basic analytic capabilities within Transact-SQL. These functions will be useful in data warehouses that allow user queries into the relational database rather than exclusively through Analysis Services. Also, these complex calculations are commonly used during data staging to develop valuable data attributes.

    ROW_NUMBER. Returns the sequential row number of the result set.

    RANK. Returns the rank of rows in a result set. RANK would be identical to ROW_NUMBER on an ordered set, but for the treatment of rows that tie. All rows with the same ordered value receive the same rank. The next rank matches up with the ROW_NUMBER again. In other words, if there is a two-way tie for first place, rows one and two receive RANK=1, and row three receives RANK=3. No rows receive RANK=2.

    DENSE_RANK. Returns the rank of rows in a result set. The DENSE_RANK function is similar to RANK, but squeezes out the gaps left by the RANK function. In the sample above, rows one and two receive RANK=1, and row three receives RANK=2.

    NTILE. Divides an ordered set into the specified number of groups of approximately equal size.

    source: http://www.microsoft.com/technet/prodtechnol/sql/2005/dwsqlsy.mspx


    Posted by ibeckett on Thursday, May 29, 2008 5:31 PM
    Permalink | Comments (0) | Post RSSRSS comment feed

    What steps are required to prepare a database for initiation of a mirroring session in SQL 2005?

    1.       Identify the Principal, Mirror, and Witness 

    2.       Create a certificate for authentication between the end points on the Principal/Mirror 

    3.       Create the end points on the Principal/Mirror 

    4.       Set the Principal and Mirror recovery models to "Full" or else you will get an error 

    5.       Ensure that the Principal and Mirror are synchronized by backing up the Principal and applying the backup to the mirror using the "no recovery" option


    Posted by ibeckett on Tuesday, May 20, 2008 4:08 AM
    Permalink | Comments (0) | Post RSSRSS comment feed

    FTP component fails in SSIS 2005 even though the connection manager appears to be working

    Problem: FTP component fails with error "unable to connect" even though when testing the connection manager you are able to connect.

    Solution:
    1. Add "SSIS Debug Host" to firewall exceptions

    2. Add visual studio / BIDS to firewall exceptions

    3. Make sure package security property "ProtectionLevel" is set to some setting that will keep the FTP password encrypted (anything other than "DontSaveSensitive").  If you have it set to the common option "DontSaveSensitive" then your FTP task will fail.

    Tags: ,
    Categories: SQL 2005
    Posted by ibeckett on Monday, May 05, 2008 10:31 AM
    Permalink | Comments (0) | Post RSSRSS comment feed

    accessing SharePoint document libraries using UNC in Windows Server 2003

    Recently I wrote a .net console script to automate some stuff in Excel and part of the script includes uploading all of the contents of a given directory to a SharePoint document library.  After moving the script to production it did not work. Turns out that you need to have the WebClient service enabled. This service is disabled by default in Server 2003 so you will need to set it to Automatic or Manual and then start the service.

    After starting the WebClient service you can access the SharePoint library like: \\sharepoint\site1\documentlibrary


    Posted by ibeckett on Sunday, May 04, 2008 5:19 AM
    Permalink | Comments (0) | Post RSSRSS comment feed

    be sure to kill rogue processes when debugging SSIS packages in BIDS 2005

    Recently I have been dealing with a very annoying issue when debugging large SSIS packages using busines intelligence development studio.  Child packages called from parent package will fail randomly with an error something like "package already being used or locked by another process".  Turns out the solution is to kill left over rogue processes that may still be running from prior executions.

    Jamie Thompson at blogs.conchango.com explains:

    Packages in BIDS are executed by a process called DtsDebughost.exe. Child packages called from there are executed by dtshost.exe. These are managed processes and can sometimes be left as running processes (viewable in Task Manager) after the packages have actually completed. I'm guessing this is because they are waiting for garbage collection.
    Anyway, I am guessing these might be the problem. You could kill the processes manually if they are causing a problem.

    May 1, 2008 16:18 

     source: http://blogs.conchango.com/jamiethomson/archive/2005/05/16/1414.aspx?CommentPosted=true#commentmessage


    Posted by ibeckett on Thursday, May 01, 2008 1:22 PM
    Permalink | Comments (0) | Post RSSRSS comment feed

    XML in 10 points

    XML, XLink, Namespace, DTD, Schema, CSS, XHTML ... If you are new to XML, it may be hard to know where to begin. This summary in 10 points attempts to capture enough of the basic concepts to enable a beginner to see the forest through the trees. And if you are giving a presentation on XML, why not start with these 10 points?

    http://www.w3.org/XML/1999/XML-in-10-points.html


    Categories: XML
    Posted by ibeckett on Thursday, May 01, 2008 7:40 AM
    Permalink | Comments (0) | Post RSSRSS comment feed