Log files & Power Pivot - split DATETIME out into distinct columns?

I'm importing some IIS logs into Power Pivot to do some analysis, using the following approach:

LogParser.exe "
SELECT 
EXTRACT_TOKEN(LogFileName, 5, '\') As LogFile,
LogRow,
to_localtime(to_timestamp(date,time)) as LOG_DTTM,
cs-UserName as ClientUserName,
cs-Method,
cs-Uri-Stem as UriStem,
cs-Uri-Query as UriQuery,
sc-Status as Status,
sc-SubStatus as SubStatus,
time-Taken as ElapsedTimeMS,
c-Ip As ClientIP,
s-ComputerName as ComputerName,
s-Ip as ServerIP,
s-Port as Port,
sc-Win32-Status as Win32Status,
cs(User-Agent) as UserAgent 
    INTO IIS_LOG_PROD_STAGING 
FROM somefile.log" -o:SQL -oConnString:"Driver=SQL Server;Server=MY_SERVER_NAME; Database=MY_DATABASE_NAME;Trusted_Connection=yes" -createTable:ON -e:10 -transactionRowCount:-1

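...my question is: should I split the discrete parts of the DateTime column out into separate columns at the database storage level, or should I leave that to calculated columns in the PowerPivot model?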

Marco Russo seems to suggest that at least the DATE should be split out into a separate column:
http://sqlblog.com/blogs/marco_russo/archive/2011/09/01/separate-date-and-time-in-powerpivot-and-bism-tabular.aspx

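PowerPivot still reads the column as a DateTime, but the hour/minute/seconds are gone and the number of unique values is reduced to the number of different days in your data. Of course, it is easier to join a Calendar table this way!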
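For concreteness, here is a minimal sketch of what that split could look like on the SQL Server side. It assumes the staging table created by the LogParser command above, that LOG_DTTM was created as a datetime column, and SQL Server 2008 or later (for the date and time types). The LogDate and LogTime names are made up for illustration.

```sql
-- Sketch only: LogDate and LogTime are hypothetical column names.
SELECT
    LogRow,
    CAST(LOG_DTTM AS date)    AS LogDate,  -- one distinct value per day
    CAST(LOG_DTTM AS time(0)) AS LogTime   -- time of day, to the second
FROM IIS_LOG_PROD_STAGING;
```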

That seems to make sense. But if I know that I want to analyze at the HourOfDay, DayOfWeek, DayOfMonth, etc. level, should I split those out into separate database columns as well?

I would highly recommend creating a date table and a time table for this type of analysis. The date table will help with the day of week, day of month, etc. calculations. It allows you to easily do date calculations and categorizations through simple joins. The time dimension lets you group by hours.

I tend to create these tables in my database and pull them into my Power Pivot model from SQL Server. My general thought is that row-level calculations are done more efficiently at the lower level (the SQL database) than in the Power Pivot model. They can be done in either place, so the location is up to you and depends on the amount of memory and CPU available on the server and on the computer running the Power Pivot model. Since Power Pivot is opened on individual laptops and I can't control those, I like to do a lot of the computation in SQL Server.

I see you tagged Power Query. There are scripts available to create a date dimension in Power Query without needing a table in SQL Server. I haven't built a time dimension in Power Query yet, but here's a good SQL Server script. The date table is at the day level. The time table goes down to the second and allows you to easily roll up times by minute, hour, etc.

Here is the date table from the link:

CREATE TABLE [dbo].[DimDate] (
    [DateKey] [int] NOT NULL
    ,[Date] [datetime] NOT NULL
    ,[Day] [char](10) NULL
    ,[DayOfWeek] [smallint] NULL
    ,[DayOfMonth] [smallint] NULL
    ,[DayOfYear] [smallint] NULL
    ,[PreviousDay] [datetime] NULL
    ,[NextDay] [datetime] NULL
    ,[WeekOfYear] [smallint] NULL
    ,[Month] [char](10) NULL
    ,[MonthOfYear] [smallint] NULL
    ,[QuarterOfYear] [smallint] NULL
    ,[Year] [int] NULL
    );
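The linked script's population logic isn't reproduced here, so the following is only a sketch of one way to fill dbo.DimDate with a recursive CTE. The date range, the yyyymmdd format for DateKey, and the use of DATENAME/DATEPART for each column are my assumptions, not necessarily what the original script does. Note that DATEPART(weekday, ...) depends on SET DATEFIRST and DATENAME depends on the session language.

```sql
-- Sketch: fill dbo.DimDate for an assumed date range (SQL Server 2008+).
DECLARE @StartDate date = '2015-01-01';  -- assumption: choose a range that covers your logs
DECLARE @EndDate   date = '2015-12-31';

WITH Dates AS (
    SELECT CAST(@StartDate AS datetime) AS d
    UNION ALL
    SELECT DATEADD(day, 1, d) FROM Dates WHERE d < @EndDate
)
INSERT INTO dbo.DimDate
    ([DateKey], [Date], [Day], [DayOfWeek], [DayOfMonth], [DayOfYear],
     [PreviousDay], [NextDay], [WeekOfYear], [Month], [MonthOfYear],
     [QuarterOfYear], [Year])
SELECT
    CAST(CONVERT(char(8), d, 112) AS int),  -- yyyymmdd as an int (assumed key format)
    d,
    DATENAME(weekday, d),                   -- e.g. 'Monday' (language dependent)
    DATEPART(weekday, d),                   -- numbering depends on SET DATEFIRST
    DATEPART(day, d),
    DATEPART(dayofyear, d),
    DATEADD(day, -1, d),
    DATEADD(day, 1, d),
    DATEPART(week, d),
    DATENAME(month, d),                     -- e.g. 'January' (language dependent)
    DATEPART(month, d),
    DATEPART(quarter, d),
    DATEPART(year, d)
FROM Dates
OPTION (MAXRECURSION 0);                    -- lift the default limit of 100 recursions
```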

And now the time table:

create table time_of_day 
( 
     time_of_day_key smallint primary key, 
     hour_of_day_24 tinyint,                --0-23, military/European time 
     hour_of_day_12 tinyint,                --1-12, repeating for AM/PM, for us American types 
     am_pm char(2),                         --AM/PM 
     minute_of_hour tinyint,                --the minute of the hour, reset at the top of each hour. 0-59 
     half_hour tinyint,                     --1 or 2, if it is the first or second half of the hour 
     half_hour_of_day tinyint,              --1-48, incremented at the top of each half hour for the entire day 
     quarter_hour tinyint,                  --1-4, for each quarter hour 
     quarter_hour_of_day tinyint,           --1-96, incremented at the top of each quarter hour for the entire day 
     string_representation_24 char(5),      --military/European textual representation 
     string_representation_12 char(5)       --12 hour clock representation sans AM/PM 
) 
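As with the date table, here is a hedged sketch of how time_of_day could be populated. The columns shown stop at minute_of_hour, and a smallint key cannot hold 86,400 distinct seconds, so this sketch uses one row per minute (1,440 rows). Using minutes since midnight as time_of_day_key is my assumption, as is the zero-padded hh:mm format for the string columns.

```sql
-- Sketch: one row per minute of the day, key = minutes since midnight (assumption).
WITH Minutes AS (
    SELECT 0 AS m
    UNION ALL
    SELECT m + 1 FROM Minutes WHERE m < 1439
),
Parts AS (
    SELECT
        m,
        m / 60 AS h24,                                   -- integer division: 0-23
        CASE WHEN (m / 60) % 12 = 0 THEN 12
             ELSE (m / 60) % 12 END AS h12,              -- 12, 1-11
        m % 60 AS mi                                     -- 0-59
    FROM Minutes
)
INSERT INTO time_of_day
    (time_of_day_key, hour_of_day_24, hour_of_day_12, am_pm, minute_of_hour,
     half_hour, half_hour_of_day, quarter_hour, quarter_hour_of_day,
     string_representation_24, string_representation_12)
SELECT
    m,
    h24,
    h12,
    CASE WHEN h24 < 12 THEN 'AM' ELSE 'PM' END,
    mi,
    mi / 30 + 1,                                         -- 1 or 2
    m / 30 + 1,                                          -- 1-48
    mi / 15 + 1,                                         -- 1-4
    m / 15 + 1,                                          -- 1-96
    RIGHT('0' + CAST(h24 AS varchar(2)), 2) + ':' + RIGHT('0' + CAST(mi AS varchar(2)), 2),
    RIGHT('0' + CAST(h12 AS varchar(2)), 2) + ':' + RIGHT('0' + CAST(mi AS varchar(2)), 2)
FROM Parts
OPTION (MAXRECURSION 1440);                              -- default limit of 100 is too low
```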

Even if you aren't really building a dimensional model, having these tables will help.
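To tie the two tables back to the log data, something along these lines could work. This is a sketch under several assumptions: the staging table lives in the dbo schema, LOG_DTTM is a datetime, DateKey is stored as yyyymmdd, and time_of_day_key is minutes since midnight. The view name vw_IisLogWithKeys is hypothetical. GO is the batch separator used by SSMS and sqlcmd, not a T-SQL statement.

```sql
-- Hypothetical view: adds the two join keys to the staging rows.
CREATE VIEW dbo.vw_IisLogWithKeys
AS
SELECT
    s.LogFile,
    s.LogRow,
    s.LOG_DTTM,
    s.UriStem,
    s.Status,
    s.ElapsedTimeMS,
    CAST(CONVERT(char(8), s.LOG_DTTM, 112) AS int) AS DateKey,          -- yyyymmdd (assumed)
    CAST(DATEPART(hour, s.LOG_DTTM) * 60
         + DATEPART(minute, s.LOG_DTTM) AS smallint) AS time_of_day_key -- minutes since midnight (assumed)
FROM dbo.IIS_LOG_PROD_STAGING AS s;
GO

-- Example rollup: requests and average elapsed time by day of week and hour.
SELECT
    d.[Day],
    t.hour_of_day_24,
    COUNT(*)             AS Requests,
    AVG(v.ElapsedTimeMS) AS AvgElapsedMS
FROM dbo.vw_IisLogWithKeys AS v
JOIN dbo.DimDate AS d ON d.[DateKey] = v.DateKey
JOIN time_of_day AS t ON t.time_of_day_key = v.time_of_day_key
GROUP BY d.[Day], d.[DayOfWeek], t.hour_of_day_24
ORDER BY d.[DayOfWeek], t.hour_of_day_24;
```

In Power Pivot you would import the view plus the two tables and create the relationships on DateKey and time_of_day_key, so the model only ever sees small integer keys on the fact side.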