How do I split a string in Haskell?

Question

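Is there a standard way to split a string in Haskell?

lines and words work great for splitting on a space or a newline, but surely there is a standard way to split on a comma?

I couldn't find one on Hoogle.

To be specific, I'm looking for something where split "," "my,comma,separated,list" returns ["my","comma","separated","list"].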

Jonno_FTW · Accepted Answer

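There is a package for this called split. Install it with:

cabal install split

Use it like this:

ghci> import Data.List.Split
ghci> splitOn "," "my,comma,separated,list"
["my","comma","separated","list"]

It comes with many other functions for splitting on matching delimiters or on several delimiters at once.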

Steve · Answer

Remember that you can look up the definitions of Prelude functions!

http://www.haskell.org/onlinereport/standard-prelude.html

Looking there, the definition of words is:

words   :: String -> [String]
words s =  case dropWhile Char.isSpace s of
             "" -> []
             s' -> w : words s''
                   where (w, s'') = break Char.isSpace s'

So, change it into a function that takes a predicate:

wordsWhen     :: (Char -> Bool) -> String -> [String]
wordsWhen p s =  case dropWhile p s of
                   "" -> []
                   s' -> w : wordsWhen p s''
                         where (w, s'') = break p s'

Then call it with whatever predicate you want!

main = print $ wordsWhen (==',') "break,this,string,at,commas"
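
One thing to keep in mind: like words, wordsWhen skips runs of delimiters, so empty fields disappear ("a,,b" yields two words, not three). If empty fields matter, a variant along the following lines may help. This is only a sketch; splitKeepEmpty is a made-up name, not a Prelude or library function.

```haskell
-- Hypothetical variant of wordsWhen that keeps empty fields.
-- Every delimiter ends exactly one field, so "a,,b," gives four fields.
splitKeepEmpty :: (Char -> Bool) -> String -> [String]
splitKeepEmpty p s = case break p s of
  (w, [])       -> [w]                          -- no delimiter left: last field
  (w, _ : rest) -> w : splitKeepEmpty p rest    -- drop the delimiter, continue
```

Note that with this definition an empty input produces one empty field rather than an empty list.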

Emmanuel Touzery · Answer

If you use Data.Text, there is splitOn:

http://hackage.haskell.org/packages/archive/text/0.11.2.0/doc/html/Data-Text.html#v:splitOn

This is built into the Haskell Platform.

So, for instance:

import qualified Data.Text as T
main = print $ T.splitOn (T.pack " ") (T.pack "this is a test")

or:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T
main = print $ T.splitOn " " "this is a test"
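
If the rest of the program works with String, the conversion can be wrapped up in a small helper. The sketch below assumes comma-separated input whose fields should have surrounding whitespace trimmed; splitFields is a hypothetical name, while T.pack, T.unpack, T.strip and T.splitOn are functions from Data.Text.

```haskell
import qualified Data.Text as T

-- Hypothetical helper: split a String on commas via Data.Text and
-- trim whitespace around each field.
-- Note: T.splitOn raises an error if the delimiter is empty, so the
-- delimiter is fixed to "," here.
splitFields :: String -> [String]
splitFields = map (T.unpack . T.strip) . T.splitOn (T.pack ",") . T.pack
```

For example, splitFields "my, comma ,separated,list" evaluates to ["my","comma","separated","list"].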

evilcandybag · Answer

In the module Text.Regex (part of the Haskell Platform), there is a function:

splitRegex :: Regex -> String -> [String]

which splits a string based on a regular expression. The API can be found on Hackage.

antimatter · Answer

Use Data.List.Split, which is provided by the split package:

[me@localhost]$ ghci
Prelude> import Data.List.Split
Prelude Data.List.Split> let l = splitOn "," "1,2,3,4"
Prelude Data.List.Split> :t l
l :: [[Char]]
Prelude Data.List.Split> l
["1","2","3","4"]
Prelude Data.List.Split> let { convert :: [String] -> [Integer]; convert = map read }
Prelude Data.List.Split> let l2 = convert l
Prelude Data.List.Split> :t l2
l2 :: [Integer]
Prelude Data.List.Split> l2
[1,2,3,4]
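
A caveat about the conversion step: map read throws an exception at runtime if any field is not a valid number. A more defensive sketch, assuming the fields have already been split, uses readMaybe from Text.Read; convertSafe is a hypothetical name.

```haskell
import Text.Read (readMaybe)

-- Hypothetical safe counterpart of convert: returns Nothing if any
-- field fails to parse as an Integer, instead of throwing.
convertSafe :: [String] -> Maybe [Integer]
convertSafe = traverse readMaybe
```

With this, convertSafe ["1","2","3","4"] gives Just [1,2,3,4], while convertSafe ["1","x"] gives Nothing.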

fuz · Answer

Try this one:

import Data.List (unfoldr)

separateBy :: Eq a => a -> [a] -> [[a]]
separateBy chr = unfoldr sep where
  sep [] = Nothing
  sep l  = Just . fmap (drop 1) . break (== chr) $ l

It only works for a single character, but it should be easy to extend.
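
One possible way to extend it to a multi-element delimiter is sketched below. It keeps the unfoldr structure and replaces break with a helper that looks for the delimiter as a prefix. The names separateByList and breakAtDelim are made up for illustration, and treating an empty delimiter as "do not split" is an assumption.

```haskell
import Data.List (isPrefixOf, unfoldr)

-- Hypothetical extension of separateBy to a delimiter that is itself a list.
separateByList :: Eq a => [a] -> [a] -> [[a]]
separateByList []    xs = [xs]  -- assumption: an empty delimiter never splits
separateByList delim xs = unfoldr sep xs
  where
    sep [] = Nothing
    sep l  = Just (breakAtDelim l)

    -- Returns the chunk before the first delimiter and the input after it.
    breakAtDelim [] = ([], [])
    breakAtDelim rest@(y : ys)
      | delim `isPrefixOf` rest = ([], drop (length delim) rest)
      | otherwise =
          let (chunk, remaining) = breakAtDelim ys
          in (y : chunk, remaining)
```

As with the single-character version, a delimiter at the very end of the input does not produce a trailing empty chunk.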

Frank Meisschaert · Answer

split :: Eq a => a -> [a] -> [[a]]
split d [] = []
split d s = x : split d (drop 1 y) where (x,y) = span (/= d) s

For example:

split ';' "a;bb;ccc;;d"
> ["a","bb","ccc","","d"]

A single trailing delimiter is dropped:

split ';' "a;bb;ccc;;d;"
> ["a","bb","ccc","","d"]

Robin Begbie · Answer

I started learning Haskell yesterday, so correct me if I'm wrong, but:

split :: Eq a => a -> [a] -> [[a]]
split x y = func x y [[]]
    where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) =
            if y==x
                then func x ys ([]:(z:zs))
                else func x ys ((y:z):zs)

gives:

*Main> split ' ' "this is a test"
["this","is","a","test"]

or maybe you wanted

*Main> splitWithStr " and " "this and is and a and test"
["this","is","a","test"]

which would be:

splitWithStr :: Eq a => [a] -> [a] -> [[a]]
splitWithStr x y = func x y [[]]
    where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) =
            if (take (length x) (y:ys)) == x
                then func x (drop (length x) (y:ys)) ([]:(z:zs))
                else func x ys ((y:z):zs)

Evi1M4chine · Answer

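I don't know how to add a comment to Steve's answer, but I would like to recommend the GHC libraries documentation, and specifically the sublist functions in Data.List, which is much better as a reference than just reading the plain Haskell Report.

In general, a fold with a rule for when to start a new sublist should solve it too.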
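
The fold idea could look something like the following sketch. It uses foldr with an accumulator that always holds at least one (possibly empty) sublist; hitting the delimiter starts a new sublist, and anything else is pushed onto the current one. The name splitFold is made up for illustration.

```haskell
-- Hypothetical fold-based splitter: a delimiter starts a new sublist,
-- any other element is prepended to the current (first) sublist.
splitFold :: Eq a => a -> [a] -> [[a]]
splitFold d = foldr step [[]]
  where
    step x acc@(cur : rest)
      | x == d    = [] : acc            -- rule: open a new sublist
      | otherwise = (x : cur) : rest    -- feed the current sublist
    step _ [] = [[]]  -- not reached: the accumulator starts non-empty
```

Unlike words, this keeps empty fields, so splitFold ',' "a,b,,c" gives ["a","b","","c"].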

fp_mora · Answer

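If you don't want to import anything, you can simply substitute a space for the separator character, because the separator that words targets is whitespace. Something like: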

words [if c == ',' then ' ' else c|c <- "my,comma,separated,list"]

or

words $ let f ',' = ' '; f c = c in map f "my,comma,separated,list"

You can turn this into a function with parameters. You can also do away with the single character-to-match parameter by matching many characters at once, as in:

 [if elem c ";,.:-+@!$#?" then ' ' else c|c <-"my,comma;separated!list"]
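
Put together as a parameterised function, that might look like the sketch below. splitOnAny is a hypothetical name. Because it goes through words, it also splits on any whitespace already in the input and drops empty fields.

```haskell
-- Hypothetical wrapper: replace every separator character with a space,
-- then let words do the splitting.
splitOnAny :: [Char] -> String -> [String]
splitOnAny seps = words . map (\c -> if c `elem` seps then ' ' else c)
```

For instance, splitOnAny ";,!" "my,comma;separated!list" gives ["my","comma","separated","list"].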

Irfan Hamid · Answer

In addition to the efficient, pre-built functions given in the other answers, I'll add my own, which are simply part of a repertoire of Haskell functions I wrote while learning the language in my own time:

-- Correct but inefficient implementation
wordsBy :: String -> Char -> [String]
wordsBy s c = reverse (go s []) where
    go s' ws = case (dropWhile (\c' -> c' == c) s') of
        "" -> ws
        rem -> go ((dropWhile (\c' -> c' /= c) rem)) ((takeWhile (\c' -> c' /= c) rem) : ws)

-- Breaks up by predicate function to allow for more complex conditions (\c -> c == ',' || c == ';')
wordsByF :: String -> (Char -> Bool) -> [String]
wordsByF s f = reverse (go s []) where
    go s' ws = case ((dropWhile (\c' -> f c')) s') of
        "" -> ws
        rem -> go ((dropWhile (\c' -> (f c') == False)) rem) (((takeWhile (\c' -> (f c') == False)) rem) : ws)

The solutions are at least tail-recursive, so they won't cause a stack overflow.
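
It may be worth adding that the accumulator-and-reverse style has a trade-off in a lazy language: the whole input must be consumed before the first word is returned. A version that conses each word onto the recursive call, in the style of wordsWhen from Steve's answer, can produce words incrementally. The sketch below uses a made-up name, wordsByLazy, and takes its arguments in a different order from wordsBy.

```haskell
-- Hypothetical lazy counterpart of wordsBy: each word is returned as soon
-- as it has been found, so the result can be consumed incrementally.
wordsByLazy :: Char -> String -> [String]
wordsByLazy c s = case dropWhile (== c) s of
  ""   -> []
  rest -> let (w, rest') = break (== c) rest
          in w : wordsByLazy c rest'
```

Because the result is built lazily, an expression such as take 3 (wordsByLazy ',' (cycle "ab,")) terminates even though the input is infinite.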

Andrey · Answer

An example in ghci:

> import qualified Text.Regex as R
> R.splitRegex (R.mkRegex "x") "2x3x777"
> ["2","3","777"]