Hi,
select regexp_substr('10.3.4.211 - 65.123.101.114 - - [21/May/2013:15:54:32] "GET /V','"\w+\s(/[^/]+)',1,1,' ',1) from dual
The above query is /V but I want only V..
Can you please check and modify regular expression accordingly?
Thanks in Advance
Regular Expressions
Moderator: NorbertKrupa
Re: Regular Expressions
Hi,
I ran the below query
copy cdn.inbound_access_log1
(
record filler varchar(65000),
EndPoint_IP_Address as split_part(record,' ', 1),
Client_IP_Address as split_part(record,' - ', 2),
Request_Time as to_timestamp(regexp_substr(record, '\[([^]]+)]', 1,1,'',1), 'DD/Mon/YYYY:HH:MI:SS'),
HTTP_Request_Method as regexp_substr(record, '"([^\s]+)',1,1,'',1),
Delivery_Type as regexp_substr(record, '"\w+\s/([^w+])',1,1,'',1),
Delivery_Format as regexp_substr(record, '"\w+\s(/[^/]+)/([^/]+)',1,1,'',2),
Bucket_ID as regexp_substr(record,'"\w+\s(/[^/]+)(/[^/]+)/([^/]+)',1,1,'',3)::int
)
from '/home/cdn/spain_samples/SENT_access_2013-05-21_T15.59.31_213.4.28.244.log' direct abort on error;
The above query is working fine whenever the bucket_id number is there.
Input file is
213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:57:05] "GET /V/0/11573/Granturismo1.mp4 HTTP/1.0 -" 200 11171317 1048576 2 1 "Wget/1.11.4"
[213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:14] "GET /X/0/crossdomain.xml HTTP/1.1 -" 200 307 307 0 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW6
4; Trident/6.0)"[/size]
213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:24] "GET /V/0/11573/Granturismo1.mp4?start=0 HTTP/1.1 http://democdn.biz.tm/player-5.1.897.swf" 200 11125709 111257
09 10 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)"
213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:59:09] "GET /V/0/11467/The.Dark.Knight.Rises.2012.officialTrailer.ism/Manifest HTTP/1.0 -" 200 9436 9436 0 0 "Wget/1.1
1.4"
The above query is giving error because of this record is not having bucket_id.
213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:14] "GET /X/0/crossdomain.xml HTTP/1.1 -" 200 307 307 0 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW6
4; Trident/6.0)"[/size]
How to load these records which is having missing fields values ?
How to redirect these records to rejected file path?
I want to redirected these records to error file whever is not loaded and reload after that.
I ran the below query
copy cdn.inbound_access_log1
(
record filler varchar(65000),
EndPoint_IP_Address as split_part(record,' ', 1),
Client_IP_Address as split_part(record,' - ', 2),
Request_Time as to_timestamp(regexp_substr(record, '\[([^]]+)]', 1,1,'',1), 'DD/Mon/YYYY:HH:MI:SS'),
HTTP_Request_Method as regexp_substr(record, '"([^\s]+)',1,1,'',1),
Delivery_Type as regexp_substr(record, '"\w+\s/([^w+])',1,1,'',1),
Delivery_Format as regexp_substr(record, '"\w+\s(/[^/]+)/([^/]+)',1,1,'',2),
Bucket_ID as regexp_substr(record,'"\w+\s(/[^/]+)(/[^/]+)/([^/]+)',1,1,'',3)::int
)
from '/home/cdn/spain_samples/SENT_access_2013-05-21_T15.59.31_213.4.28.244.log' direct abort on error;
The above query is working fine whenever the bucket_id number is there.
Input file is
213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:57:05] "GET /V/0/11573/Granturismo1.mp4 HTTP/1.0 -" 200 11171317 1048576 2 1 "Wget/1.11.4"
[213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:14] "GET /X/0/crossdomain.xml HTTP/1.1 -" 200 307 307 0 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW6
4; Trident/6.0)"[/size]
213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:24] "GET /V/0/11573/Granturismo1.mp4?start=0 HTTP/1.1 http://democdn.biz.tm/player-5.1.897.swf" 200 11125709 111257
09 10 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)"
213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:59:09] "GET /V/0/11467/The.Dark.Knight.Rises.2012.officialTrailer.ism/Manifest HTTP/1.0 -" 200 9436 9436 0 0 "Wget/1.1
1.4"
The above query is giving error because of this record is not having bucket_id.
213.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:14] "GET /X/0/crossdomain.xml HTTP/1.1 -" 200 307 307 0 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW6
4; Trident/6.0)"[/size]
How to load these records which is having missing fields values ?
How to redirect these records to rejected file path?
I want to redirected these records to error file whever is not loaded and reload after that.